{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "name": "09_SkimLit_nlp_milestone_project_2.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "toc_visible": true,
      "mount_file_id": "1_yq3R-ThKP78_byQV9OovntEcXpQcBK2",
      "authorship_tag": "ABX9TyPhrY1FRJwt9IJ5LJn2QHhH",
      "include_colab_link": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/mrdbourke/tensorflow-deep-learning/blob/main/09_SkimLit_nlp_milestone_project_2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dDWUcMGOauy8"
      },
      "source": [
        "# Milestone Project 2: SkimLit 📄🔥\n",
        "\n",
        "In the previous notebook ([NLP fundamentals in TensorFlow](https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/08_introduction_to_nlp_in_tensorflow.ipynb)), we went through some fundamental natural lanuage processing concepts. The main ones being **tokenzation** (turning words into numbers) and **creating embeddings** (creating a numerical representation of words).\n",
        "\n",
        "In this project, we're going to be putting what we've learned into practice.\n",
        "\n",
        "More specificially, we're going to be replicating the deep learning model behind the 2017 paper [*PubMed 200k RCT: a Dataset for Sequenctial Sentence Classification in Medical Abstracts*](https://arxiv.org/abs/1710.06071).\n",
        "\n",
        "When it was released, the paper presented a new dataset called PubMed 200k RCT which consists of ~200,000 labelled Randomized Controlled Trial (RCT) abstracts.\n",
        "\n",
        "The goal of the dataset was to explore the ability for NLP models to classify sentences which appear in sequential order.\n",
        "\n",
        "In other words, given the abstract of a RCT, what role does each sentence serve in the abstract?\n",
        "\n",
        "![Skimlit example inputs and outputs](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/09-skimlit-overview-input-and-output.png)\n",
        "\n",
        "*Example inputs ([harder to read abstract from PubMed](https://pubmed.ncbi.nlm.nih.gov/28942748/)) and outputs ([easier to read abstract](https://pubmed.ncbi.nlm.nih.gov/32537182/)) of the model we're going to build. The model will take an abstract wall of text and predict the section label each sentence should have.*  \n",
        "\n",
        "### Model Input\n",
        "\n",
        "For example, can we train an NLP model which takes the following input (note: the following sample has had all numerical symbols replaced with \"@\"):\n",
        "\n",
        "> To investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( OA ). A total of @ patients with primary knee OA were randomized @:@ ; @ received @ mg/day of prednisolone and @ received placebo for @ weeks. Outcome measures included pain reduction and improvement in function scores and systemic inflammation markers. Pain was assessed using the visual analog pain scale ( @-@ mm ).\n",
        "Secondary outcome measures included the Western Ontario and McMaster Universities Osteoarthritis Index scores , patient global assessment ( PGA ) of the severity of knee OA , and @-min walk distance ( @MWD ).,\n",
        "Serum levels of interleukin @ ( IL-@ ) , IL-@ , tumor necrosis factor ( TNF ) - , and high-sensitivity C-reactive protein ( hsCRP ) were measured.\n",
        "There was a clinically relevant reduction in the intervention group compared to the placebo group for knee pain , physical function , PGA , and @MWD at @ weeks. The mean difference between treatment arms ( @ % CI ) was @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; and @ ( @-@ @ ) , p < @ , respectively. Further , there was a clinically relevant reduction in the serum levels of IL-@ , IL-@ , TNF - , and hsCRP at @ weeks in the intervention group when compared to the placebo group. These differences remained significant at @ weeks. The Outcome Measures in Rheumatology Clinical Trials-Osteoarthritis Research Society International responder rate was @ % in the intervention group and @ % in the placebo group ( p < @ ). Low-dose oral prednisolone had both a short-term and a longer sustained effect resulting in less knee pain , better physical function , and attenuation of systemic inflammation in older patients with knee OA ( ClinicalTrials.gov identifier NCT@ ).\n",
        "\n",
        "### Model output\n",
        "\n",
        "And returns the following output:\n",
        "\n",
        "```\n",
        "['###24293578\\n',\n",
        " 'OBJECTIVE\\tTo investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( OA ) .\\n',\n",
        " 'METHODS\\tA total of @ patients with primary knee OA were randomized @:@ ; @ received @ mg/day of prednisolone and @ received placebo for @ weeks .\\n',\n",
        " 'METHODS\\tOutcome measures included pain reduction and improvement in function scores and systemic inflammation markers .\\n',\n",
        " 'METHODS\\tPain was assessed using the visual analog pain scale ( @-@ mm ) .\\n',\n",
        " 'METHODS\\tSecondary outcome measures included the Western Ontario and McMaster Universities Osteoarthritis Index scores , patient global assessment ( PGA ) of the severity of knee OA , and @-min walk distance ( @MWD ) .\\n',\n",
        " 'METHODS\\tSerum levels of interleukin @ ( IL-@ ) , IL-@ , tumor necrosis factor ( TNF ) - , and high-sensitivity C-reactive protein ( hsCRP ) were measured .\\n',\n",
        " 'RESULTS\\tThere was a clinically relevant reduction in the intervention group compared to the placebo group for knee pain , physical function , PGA , and @MWD at @ weeks .\\n',\n",
        " 'RESULTS\\tThe mean difference between treatment arms ( @ % CI ) was @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; and @ ( @-@ @ ) , p < @ , respectively .\\n',\n",
        " 'RESULTS\\tFurther , there was a clinically relevant reduction in the serum levels of IL-@ , IL-@ , TNF - , and hsCRP at @ weeks in the intervention group when compared to the placebo group .\\n',\n",
        " 'RESULTS\\tThese differences remained significant at @ weeks .\\n',\n",
        " 'RESULTS\\tThe Outcome Measures in Rheumatology Clinical Trials-Osteoarthritis Research Society International responder rate was @ % in the intervention group and @ % in the placebo group ( p < @ ) .\\n',\n",
        " 'CONCLUSIONS\\tLow-dose oral prednisolone had both a short-term and a longer sustained effect resulting in less knee pain , better physical function , and attenuation of systemic inflammation in older patients with knee OA ( ClinicalTrials.gov identifier NCT@ ) .\\n',\n",
        " '\\n']\n",
        " ```\n",
        "\n",
        "### Problem in a sentence\n",
        "\n",
        "The number of RCT papers released is continuing to increase, those without structured abstracts can be hard to read and in turn slow down researchers moving through the literature. \n",
        "\n",
        "### Solution in a sentence\n",
        "\n",
        "Create an NLP model to classify abstract sentences into the role they play (e.g. objective, methods, results, etc)  to enable researchers to skim through the literature (hence SkimLit 🤓🔥) and dive deeper when necessary.\n",
        "\n",
        "> 📖 **Resources:** Before going through the code in this notebook, you might want to get a background of what we're going to be doing. To do so, spend an hour (or two) going through the following papers and then return to this notebook:\n",
        "1. Where our data is coming from: [*PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts*](https://arxiv.org/abs/1710.06071)\n",
        "2. Where our model is coming from: [*Neural networks for joint sentence\n",
        "classification in medical paper abstracts*](https://arxiv.org/pdf/1612.05251.pdf).\n",
        "\n",
        "## What we're going to cover\n",
        "\n",
        "Time to take what we've learned in the NLP fundmentals notebook and build our biggest NLP model yet:\n",
        "\n",
        "* Downloading a text dataset ([PubMed RCT200k from GitHub](https://github.com/Franck-Dernoncourt/pubmed-rct))\n",
        "* Writing a preprocessing function to prepare our data for modelling\n",
        "* Setting up a series of modelling experiments\n",
        "  * Making a baseline (TF-IDF classifier)\n",
        "  * Deep models with different combinations of: token embeddings, character embeddings, pretrained embeddings, positional embeddings\n",
        "* Building our first multimodal model (taking multiple types of data inputs)\n",
        "  * Replicating the model architecture from https://arxiv.org/pdf/1612.05251.pdf \n",
        "* Find the most wrong predictions\n",
        "* Making predictions on PubMed abstracts from the wild\n",
        "\n",
        "## How you should approach this notebook\n",
        "\n",
        "You can read through the descriptions and the code (it should all run, except for the cells which error on purpose), but there's a better option.\n",
        "\n",
        "Write all of the code yourself.\n",
        "\n",
        "Yes. I'm serious. Create a new notebook, and rewrite each line by yourself. Investigate it, see if you can break it, why does it break?\n",
        "\n",
        "You don't have to write the text descriptions but writing the code yourself is a great way to get hands-on experience.\n",
        "\n",
        "Don't worry if you make mistakes, we all do. The way to get better and make less mistakes is to write more code.\n",
        "\n",
        "> 📖 **Resource:** See the full set of course materials on GitHub: https://github.com/mrdbourke/tensorflow-deep-learning"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4NG3nevdEZBs"
      },
      "source": [
        "## Confirm access to a GPU\n",
        "\n",
        "Since we're going to be building deep learning models, let's make sure we have a GPU.\n",
        "\n",
        "In Google Colab, you can set this up by going to Runtime -> Change runtime type -> Hardware accelerator -> GPU.\n",
        "\n",
        "If you don't have access to a GPU, the models we're building here will likely take up to 10x longer to run."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dsuQCg5Uaw1w",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "e846cd1f-9a61-4dee-9e0e-8b41961e17cf"
      },
      "source": [
        "# Check for GPU\n",
        "!nvidia-smi -L"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-87dd58c2-8f84-9bfd-8db0-31f9de58e303)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2MdzfDdzaQCb"
      },
      "source": [
        "## Get data\n",
        "\n",
        "Before we can start building a model, we've got to download the PubMed 200k RCT dataset.\n",
        "\n",
        "In a phenomenal act of kindness, the authors of the paper have made the data they used for their research availably publically and for free in the form of .txt files [on GitHub](https://github.com/Franck-Dernoncourt/pubmed-rct).\n",
        "\n",
        "We can copy them to our local directory using `git clone https://github.com/Franck-Dernoncourt/pubmed-rct`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "c0qt0M55a98x",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "657a4b11-b691-4c06-e746-21cc63dfd501"
      },
      "source": [
        "!git clone https://github.com/Franck-Dernoncourt/pubmed-rct.git\n",
        "!ls pubmed-rct"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Cloning into 'pubmed-rct'...\n",
            "remote: Enumerating objects: 30, done.\u001b[K\n",
            "remote: Total 30 (delta 0), reused 0 (delta 0), pack-reused 30\u001b[K\n",
            "Unpacking objects: 100% (30/30), done.\n",
            "PubMed_200k_RCT\n",
            "PubMed_200k_RCT_numbers_replaced_with_at_sign\n",
            "PubMed_20k_RCT\n",
            "PubMed_20k_RCT_numbers_replaced_with_at_sign\n",
            "README.md\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Y3Oe1F6e7y0E"
      },
      "source": [
        "Checking the contents of the downloaded repository, you can see there are four folders.\n",
        "\n",
        "Each contains a different version of the PubMed 200k RCT dataset.\n",
        "\n",
        "Looking at the [README file](https://github.com/Franck-Dernoncourt/pubmed-rct) from the GitHub page, we get the following information:\n",
        "* PubMed 20k is a subset of PubMed 200k. I.e., any abstract present in PubMed 20k is also present in PubMed 200k.\n",
        "* `PubMed_200k_RCT` is the same as `PubMed_200k_RCT_numbers_replaced_with_at_sign`, except that in the latter all numbers had been replaced by `@`. (same for `PubMed_20k_RCT` vs. `PubMed_20k_RCT_numbers_replaced_with_at_sign`).\n",
        "* Since Github file size limit is 100 MiB, we had to compress `PubMed_200k_RCT\\train.7z` and `PubMed_200k_RCT_numbers_replaced_with_at_sign\\train.zip`. To uncompress `train.7z`, you may use 7-Zip on Windows, Keka on Mac OS X, or p7zip on Linux.\n",
        "\n",
        "To begin with, the dataset we're going to be focused on is `PubMed_20k_RCT_numbers_replaced_with_at_sign`.\n",
        "\n",
        "Why this one?\n",
        "\n",
        "Rather than working with the whole 200k dataset, we'll keep our experiments quick by starting with a smaller subset. We could've chosen the dataset with numbers instead of having them replaced with `@` but we didn't.\n",
        "\n",
        "Let's check the file contents. "
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "crmxKEJ69bNW",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0eddd253-cdd7-455f-fd89-07ee7a69bfc0"
      },
      "source": [
        "# Check what files are in the PubMed_20K dataset \n",
        "!ls pubmed-rct/PubMed_20k_RCT_numbers_replaced_with_at_sign"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "dev.txt  test.txt  train.txt\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "joApaTyD_DYL"
      },
      "source": [
        "Beautiful, looks like we've got three separate text files:\n",
        "* `train.txt` - training samples.\n",
        "* `dev.txt` - dev is short for development set, which is another name for validation set (in our case, we'll be using and referring to this file as our validation set).\n",
        "* `test.txt` - test samples.\n",
        "\n",
        "To save ourselves typing out the filepath to our target directory each time, let's turn it into a variable."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "C1Zp21fGbBUJ"
      },
      "source": [
        "# Start by using the 20k dataset\n",
        "data_dir = \"pubmed-rct/PubMed_20k_RCT_numbers_replaced_with_at_sign/\""
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "CWqMrjLCbFTr",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "3affce6a-48f5-4821-86fd-9ba15eed3191"
      },
      "source": [
        "# Check all of the filenames in the target directory\n",
        "import os\n",
        "filenames = [data_dir + filename for filename in os.listdir(data_dir)]\n",
        "filenames"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['pubmed-rct/PubMed_20k_RCT_numbers_replaced_with_at_sign/dev.txt',\n",
              " 'pubmed-rct/PubMed_20k_RCT_numbers_replaced_with_at_sign/train.txt',\n",
              " 'pubmed-rct/PubMed_20k_RCT_numbers_replaced_with_at_sign/test.txt']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 5
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BTjZ9NziaeKU"
      },
      "source": [
        "## Preprocess data\n",
        "\n",
        "Okay, now we've downloaded some text data, do you think we're ready to model it?\n",
        "\n",
        "Wait...\n",
        "\n",
        "We've downloaded the data but we haven't even looked at it yet.\n",
        "\n",
        "What's the motto for getting familiar with any new dataset?\n",
        "\n",
        "I'll give you a clue, the word begins with \"v\" and we say it three times.\n",
        "\n",
        "> Vibe, vibe, vibe?\n",
        "\n",
        "Sort of... we've definitely got to the feel the vibe of our data.\n",
        "\n",
        "> Values, values, values?\n",
        "\n",
        "Right again, we want to *see* lots of values but not quite what we're looking for.\n",
        "\n",
        "> Visualize, visualize, visualize?\n",
        "\n",
        "Boom! That's it. To get familiar and understand how we have to prepare our data for our deep learning models, we've got to visualize it.\n",
        "\n",
        "Because our data is in the form of text files, let's write some code to read each of the lines in a target file."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2yjdhJxbbIhX"
      },
      "source": [
        "# Create function to read the lines of a document\n",
        "def get_lines(filename):\n",
        "  \"\"\"\n",
        "  Reads filename (a text file) and returns the lines of text as a list.\n",
        "  \n",
        "  Args:\n",
        "      filename: a string containing the target filepath to read.\n",
        "  \n",
        "  Returns:\n",
        "      A list of strings with one string per line from the target filename.\n",
        "      For example:\n",
        "      [\"this is the first line of filename\",\n",
        "       \"this is the second line of filename\",\n",
        "       \"...\"]\n",
        "  \"\"\"\n",
        "  with open(filename, \"r\") as f:\n",
        "    return f.readlines()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jpeOUfnkCNII"
      },
      "source": [
        "Alright, we've got a little function, `get_lines()` which takes the filepath of a text file, opens it, reads each of the lines and returns them.\n",
        "\n",
        "Let's try it out on the training data (`train.txt`)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IT7RMQsEbI0I",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "8bf29102-5d66-4778-99ae-17b7ddbe2607"
      },
      "source": [
        "train_lines = get_lines(data_dir+\"train.txt\")\n",
        "train_lines[:20] # the whole first example of an abstract + a little more of the next one"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['###24293578\\n',\n",
              " 'OBJECTIVE\\tTo investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( OA ) .\\n',\n",
              " 'METHODS\\tA total of @ patients with primary knee OA were randomized @:@ ; @ received @ mg/day of prednisolone and @ received placebo for @ weeks .\\n',\n",
              " 'METHODS\\tOutcome measures included pain reduction and improvement in function scores and systemic inflammation markers .\\n',\n",
              " 'METHODS\\tPain was assessed using the visual analog pain scale ( @-@ mm ) .\\n',\n",
              " 'METHODS\\tSecondary outcome measures included the Western Ontario and McMaster Universities Osteoarthritis Index scores , patient global assessment ( PGA ) of the severity of knee OA , and @-min walk distance ( @MWD ) .\\n',\n",
              " 'METHODS\\tSerum levels of interleukin @ ( IL-@ ) , IL-@ , tumor necrosis factor ( TNF ) - , and high-sensitivity C-reactive protein ( hsCRP ) were measured .\\n',\n",
              " 'RESULTS\\tThere was a clinically relevant reduction in the intervention group compared to the placebo group for knee pain , physical function , PGA , and @MWD at @ weeks .\\n',\n",
              " 'RESULTS\\tThe mean difference between treatment arms ( @ % CI ) was @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; and @ ( @-@ @ ) , p < @ , respectively .\\n',\n",
              " 'RESULTS\\tFurther , there was a clinically relevant reduction in the serum levels of IL-@ , IL-@ , TNF - , and hsCRP at @ weeks in the intervention group when compared to the placebo group .\\n',\n",
              " 'RESULTS\\tThese differences remained significant at @ weeks .\\n',\n",
              " 'RESULTS\\tThe Outcome Measures in Rheumatology Clinical Trials-Osteoarthritis Research Society International responder rate was @ % in the intervention group and @ % in the placebo group ( p < @ ) .\\n',\n",
              " 'CONCLUSIONS\\tLow-dose oral prednisolone had both a short-term and a longer sustained effect resulting in less knee pain , better physical function , and attenuation of systemic inflammation in older patients with knee OA ( ClinicalTrials.gov identifier NCT@ ) .\\n',\n",
              " '\\n',\n",
              " '###24854809\\n',\n",
              " 'BACKGROUND\\tEmotional eating is associated with overeating and the development of obesity .\\n',\n",
              " 'BACKGROUND\\tYet , empirical evidence for individual ( trait ) differences in emotional eating and cognitive mechanisms that contribute to eating during sad mood remain equivocal .\\n',\n",
              " 'OBJECTIVE\\tThe aim of this study was to test if attention bias for food moderates the effect of self-reported emotional eating during sad mood ( vs neutral mood ) on actual food intake .\\n',\n",
              " 'OBJECTIVE\\tIt was expected that emotional eating is predictive of elevated attention for food and higher food intake after an experimentally induced sad mood and that attentional maintenance on food predicts food intake during a sad versus a neutral mood .\\n',\n",
              " 'METHODS\\tParticipants ( N = @ ) were randomly assigned to one of the two experimental mood induction conditions ( sad/neutral ) .\\n']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 7
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "j-IfwKVAbJAy"
      },
      "source": [
        "Reading the lines from the training text file results in a list of strings containing different abstract samples, the sentences in a sample along with the role the sentence plays in the abstract.\n",
        "\n",
        "The role of each sentence is prefixed at the start of each line separated by a tab (`\\t`) and each sentence finishes with a new line (`\\n`).\n",
        "\n",
        "Different abstracts are separated by abstract ID's (lines beginning with `###`) and newlines (`\\n`).\n",
        "\n",
        "Knowing this, it looks like we've got a couple of steps to do to get our samples ready to pass as training data to our future machine learning model.\n",
        "\n",
        "Let's write a function to perform the following steps:\n",
        "* Take a target file of abstract samples.\n",
        "* Read the lines in the target file.\n",
        "* For each line in the target file:  \n",
        "  * If the line begins with `###` mark it as an abstract ID and the beginning of a new abstract.\n",
        "    * Keep count of the number of lines in a sample.\n",
        "  * If the line begins with `\\n` mark it as the end of an abstract sample.\n",
        "    * Keep count of the total lines in a sample.\n",
        "  * Record the text before the `\\t` as the label of the line.\n",
        "  * Record the text after the `\\t` as the text of the line.\n",
        "* Return all of the lines in the target text file as a list of dictionaries containing the key/value pairs:\n",
        "  * `\"line_number\"` - the position of the line in the abstract (e.g. `3`).\n",
        "  * `\"target\"` - the role of the line in the abstract (e.g. `OBJECTIVE`).\n",
        "  * `\"text\"` - the text of the line in the abstract.\n",
        "  * `\"total_lines\"` - the total lines in an abstract sample (e.g. `14`).\n",
        "* Abstract ID's and newlines should be omitted from the returned preprocessed data.\n",
        "\n",
        "Example returned preprocessed sample (a single line from an abstract):\n",
        "\n",
        "```\n",
        "[{'line_number': 0,\n",
        "  'target': 'OBJECTIVE',\n",
        "  'text': 'to investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( oa ) .',\n",
        "  'total_lines': 11},\n",
        "  ...]\n",
        "```"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "B65Ffn9abJKH"
      },
      "source": [
        "def preprocess_text_with_line_numbers(filename):\n",
        "  \"\"\"Returns a list of dictionaries of abstract line data.\n",
        "\n",
        "  Takes in filename, reads its contents and sorts through each line,\n",
        "  extracting things like the target label, the text of the sentence,\n",
        "  how many sentences are in the current abstract and what sentence number\n",
        "  the target line is.\n",
        "\n",
        "  Args:\n",
        "      filename: a string of the target text file to read and extract line data\n",
        "      from.\n",
        "\n",
        "  Returns:\n",
        "      A list of dictionaries each containing a line from an abstract,\n",
        "      the lines label, the lines position in the abstract and the total number\n",
        "      of lines in the abstract where the line is from. For example:\n",
        "\n",
        "      [{\"target\": 'CONCLUSION',\n",
        "        \"text\": The study couldn't have gone better, turns out people are kinder than you think\",\n",
        "        \"line_number\": 8,\n",
        "        \"total_lines\": 8}]\n",
        "  \"\"\"\n",
        "  input_lines = get_lines(filename) # get all lines from filename\n",
        "  abstract_lines = \"\" # create an empty abstract\n",
        "  abstract_samples = [] # create an empty list of abstracts\n",
        "  \n",
        "  # Loop through each line in target file\n",
        "  for line in input_lines:\n",
        "    if line.startswith(\"###\"): # check to see if line is an ID line\n",
        "      abstract_id = line\n",
        "      abstract_lines = \"\" # reset abstract string\n",
        "    elif line.isspace(): # check to see if line is a new line\n",
        "      abstract_line_split = abstract_lines.splitlines() # split abstract into separate lines\n",
        "\n",
        "      # Iterate through each line in abstract and count them at the same time\n",
        "      for abstract_line_number, abstract_line in enumerate(abstract_line_split):\n",
        "        line_data = {} # create empty dict to store data from line\n",
        "        target_text_split = abstract_line.split(\"\\t\") # split target label from text\n",
        "        line_data[\"target\"] = target_text_split[0] # get target label\n",
        "        line_data[\"text\"] = target_text_split[1].lower() # get target text and lower it\n",
        "        line_data[\"line_number\"] = abstract_line_number # what number line does the line appear in the abstract?\n",
        "        line_data[\"total_lines\"] = len(abstract_line_split) - 1 # how many total lines are in the abstract? (start from 0)\n",
        "        abstract_samples.append(line_data) # add line data to abstract samples list\n",
        "    \n",
        "    else: # if the above conditions aren't fulfilled, the line contains a labelled sentence\n",
        "      abstract_lines += line\n",
        "  \n",
        "  return abstract_samples"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DwmUXHrigByo"
      },
      "source": [
        "Beautiful! That's one good looking function. Let's use it to preprocess each of our RCT 20k datasets."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yDd28-PfgoUP",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "2869a548-fdb5-4e86-924e-d261cab7df14"
      },
      "source": [
        "# Get data from file and preprocess it\n",
        "%%time\n",
        "train_samples = preprocess_text_with_line_numbers(data_dir + \"train.txt\")\n",
        "val_samples = preprocess_text_with_line_numbers(data_dir + \"dev.txt\") # dev is another name for validation set\n",
        "test_samples = preprocess_text_with_line_numbers(data_dir + \"test.txt\")\n",
        "len(train_samples), len(val_samples), len(test_samples)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "CPU times: user 440 ms, sys: 87.8 ms, total: 528 ms\n",
            "Wall time: 532 ms\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vfFvPjTwgO7b"
      },
      "source": [
        "How do our training samples look?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "FcYkHrnnh0lf",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "de7c7865-742d-467f-acc9-33cd33e752fb"
      },
      "source": [
        "# Check the first abstract of our training data\n",
        "train_samples[:14]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[{'line_number': 0,\n",
              "  'target': 'OBJECTIVE',\n",
              "  'text': 'to investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( oa ) .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 1,\n",
              "  'target': 'METHODS',\n",
              "  'text': 'a total of @ patients with primary knee oa were randomized @:@ ; @ received @ mg/day of prednisolone and @ received placebo for @ weeks .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 2,\n",
              "  'target': 'METHODS',\n",
              "  'text': 'outcome measures included pain reduction and improvement in function scores and systemic inflammation markers .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 3,\n",
              "  'target': 'METHODS',\n",
              "  'text': 'pain was assessed using the visual analog pain scale ( @-@ mm ) .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 4,\n",
              "  'target': 'METHODS',\n",
              "  'text': 'secondary outcome measures included the western ontario and mcmaster universities osteoarthritis index scores , patient global assessment ( pga ) of the severity of knee oa , and @-min walk distance ( @mwd ) .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 5,\n",
              "  'target': 'METHODS',\n",
              "  'text': 'serum levels of interleukin @ ( il-@ ) , il-@ , tumor necrosis factor ( tnf ) - , and high-sensitivity c-reactive protein ( hscrp ) were measured .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 6,\n",
              "  'target': 'RESULTS',\n",
              "  'text': 'there was a clinically relevant reduction in the intervention group compared to the placebo group for knee pain , physical function , pga , and @mwd at @ weeks .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 7,\n",
              "  'target': 'RESULTS',\n",
              "  'text': 'the mean difference between treatment arms ( @ % ci ) was @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; and @ ( @-@ @ ) , p < @ , respectively .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 8,\n",
              "  'target': 'RESULTS',\n",
              "  'text': 'further , there was a clinically relevant reduction in the serum levels of il-@ , il-@ , tnf - , and hscrp at @ weeks in the intervention group when compared to the placebo group .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 9,\n",
              "  'target': 'RESULTS',\n",
              "  'text': 'these differences remained significant at @ weeks .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 10,\n",
              "  'target': 'RESULTS',\n",
              "  'text': 'the outcome measures in rheumatology clinical trials-osteoarthritis research society international responder rate was @ % in the intervention group and @ % in the placebo group ( p < @ ) .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 11,\n",
              "  'target': 'CONCLUSIONS',\n",
              "  'text': 'low-dose oral prednisolone had both a short-term and a longer sustained effect resulting in less knee pain , better physical function , and attenuation of systemic inflammation in older patients with knee oa ( clinicaltrials.gov identifier nct@ ) .',\n",
              "  'total_lines': 11},\n",
              " {'line_number': 0,\n",
              "  'target': 'BACKGROUND',\n",
              "  'text': 'emotional eating is associated with overeating and the development of obesity .',\n",
              "  'total_lines': 10},\n",
              " {'line_number': 1,\n",
              "  'target': 'BACKGROUND',\n",
              "  'text': 'yet , empirical evidence for individual ( trait ) differences in emotional eating and cognitive mechanisms that contribute to eating during sad mood remain equivocal .',\n",
              "  'total_lines': 10}]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 10
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wzFwgxkQhzJS"
      },
      "source": [
        "Fantastic! Looks like our `preprocess_text_with_line_numbers()` function worked great. \n",
        "\n",
        "How about we turn our list of dictionaries into pandas DataFrame's so we visualize them better?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "RRSTUXuth9jJ",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 483
        },
        "outputId": "2a8e45dc-96ab-4d4c-9e7c-e2a0de679282"
      },
      "source": [
        "import pandas as pd\n",
        "train_df = pd.DataFrame(train_samples)\n",
        "val_df = pd.DataFrame(val_samples)\n",
        "test_df = pd.DataFrame(test_samples)\n",
        "train_df.head(14)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>target</th>\n",
              "      <th>text</th>\n",
              "      <th>line_number</th>\n",
              "      <th>total_lines</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>to investigate the efficacy of @ weeks of dail...</td>\n",
              "      <td>0</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>a total of @ patients with primary knee oa wer...</td>\n",
              "      <td>1</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>outcome measures included pain reduction and i...</td>\n",
              "      <td>2</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>pain was assessed using the visual analog pain...</td>\n",
              "      <td>3</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>secondary outcome measures included the wester...</td>\n",
              "      <td>4</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>serum levels of interleukin @ ( il-@ ) , il-@ ...</td>\n",
              "      <td>5</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>there was a clinically relevant reduction in t...</td>\n",
              "      <td>6</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the mean difference between treatment arms ( @...</td>\n",
              "      <td>7</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>further , there was a clinically relevant redu...</td>\n",
              "      <td>8</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>these differences remained significant at @ we...</td>\n",
              "      <td>9</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the outcome measures in rheumatology clinical ...</td>\n",
              "      <td>10</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>11</th>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>low-dose oral prednisolone had both a short-te...</td>\n",
              "      <td>11</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>12</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>emotional eating is associated with overeating...</td>\n",
              "      <td>0</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>yet , empirical evidence for individual ( trai...</td>\n",
              "      <td>1</td>\n",
              "      <td>10</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "         target  ... total_lines\n",
              "0     OBJECTIVE  ...          11\n",
              "1       METHODS  ...          11\n",
              "2       METHODS  ...          11\n",
              "3       METHODS  ...          11\n",
              "4       METHODS  ...          11\n",
              "5       METHODS  ...          11\n",
              "6       RESULTS  ...          11\n",
              "7       RESULTS  ...          11\n",
              "8       RESULTS  ...          11\n",
              "9       RESULTS  ...          11\n",
              "10      RESULTS  ...          11\n",
              "11  CONCLUSIONS  ...          11\n",
              "12   BACKGROUND  ...          10\n",
              "13   BACKGROUND  ...          10\n",
              "\n",
              "[14 rows x 4 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 11
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BaVFf-qQg8xA"
      },
      "source": [
        "Now our data is in DataFrame form, we can perform some data analysis on it. "
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "rnQIDiJPg231",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "98e48114-3ffc-481e-a5f2-8485a54e76e1"
      },
      "source": [
        "# Distribution of labels in training data\n",
        "train_df.target.value_counts()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "METHODS        59353\n",
              "RESULTS        57953\n",
              "CONCLUSIONS    27168\n",
              "BACKGROUND     21727\n",
              "OBJECTIVE      13839\n",
              "Name: target, dtype: int64"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 12
        }
      ]
    },
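    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The raw counts above are easier to compare as proportions. Here's a minimal sketch in plain Python that turns them into fractions of the training set. Note that `label_counts` is just the output above typed in by hand and `label_fractions` is a name made up for this example; neither is a variable from the rest of this notebook.\n",
        "\n",
        "```python\n",
        "# Label counts copied by hand from the value_counts() output above\n",
        "label_counts = {\n",
        "    'METHODS': 59353,\n",
        "    'RESULTS': 57953,\n",
        "    'CONCLUSIONS': 27168,\n",
        "    'BACKGROUND': 21727,\n",
        "    'OBJECTIVE': 13839,\n",
        "}\n",
        "\n",
        "total = sum(label_counts.values())  # total number of training sentences\n",
        "\n",
        "# Fraction of training sentences carrying each label\n",
        "label_fractions = {label: count / total for label, count in label_counts.items()}\n",
        "\n",
        "for label, fraction in label_fractions.items():\n",
        "    print(f'{label:<12} {fraction:.3f}')\n",
        "```\n",
        "\n",
        "The fractions show the classes are imbalanced: a model that always predicted the most common label would already be right about a third of the time on the training set, which is worth keeping in mind when reading accuracy scores later on."
      ]
    },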
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HoZbOMqUhL2l"
      },
      "source": [
        "Looks like sentences with the `OBJECTIVE` label are the least common.\n",
        "\n",
        "How about we check the distribution of our abstract lengths?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "tkCRIBWbhUmD",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 267
        },
        "outputId": "2c92b9fd-eec9-49c8-ec3c-c993b8e60196"
      },
      "source": [
        "train_df.total_lines.plot.hist();"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZEAAAD6CAYAAABgZXp6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAXpUlEQVR4nO3df7BfdX3n8efLRCpSkVDSLJNgg21Gl7r+gCvg1HatjCHg1tBdl4WtS5ZhiDNgV8f9QXQ6i8Uyk+5spdJatqlkTVwV8SfZEppGxHb7Bz+CIAjo5IqwJAJJDRDRFhZ97x/fz5Wv4ebyzbn53i/35vmY+c49530+55zPZ74TXpxzPt/vN1WFJEldvGjUHZAkzV6GiCSpM0NEktSZISJJ6swQkSR1ZohIkjobWogkeVWSO/tee5O8L8nRSbYm2d7+Lmjtk+TKJONJ7kpyYt+xVrX225Os6quflOTuts+VSTKs8UiSnisz8TmRJPOAncApwMXAnqpam2QNsKCqLklyJvC7wJmt3Uer6pQkRwPbgDGggNuBk6rqsSS3Av8BuAXYDFxZVTdM1Zdjjjmmli5dOpRxStJcdPvtt/99VS2cbNv8GerDacB3qurBJCuBt7T6BuBrwCXASmBj9VLt5iRHJTm2td1aVXsAkmwFViT5GnBkVd3c6huBs4ApQ2Tp0qVs27bt4I5OkuawJA/ub9tMPRM5B/hMW15UVQ+35UeARW15MfBQ3z47Wm2q+o5J6pKkGTL0EElyGPAO4HP7bmtXHUO/n5ZkdZJtSbbt3r172KeTpEPGTFyJnAF8vaoebeuPtttUtL+7Wn0ncFzffktabar6kknqz1FV66pqrKrGFi6c9LaeJKmDmQiRc3n2VhbAJmBihtUq4Lq++nltltapwBPtttcWYHmSBW0m13JgS9u2N8mpbVbWeX3HkiTNgKE+WE9yBPA24N195bXAtUkuAB4Ezm71zfRmZo0DPwLOB6iqPUk+DNzW2l028ZAduAj4BHA4vQfqUz5UlyQdXDMyxfeFZGxsrJydJUmDS3J7VY1Nts1PrEuSOjNEJEmdGSKSpM5m6hPrmqWWrrl+JOd9YO3bR3JeSQfGKxFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSps6GGSJKjknw+ybeS3JfkTUmOTrI1yfb2d0FrmyRXJhlPcleSE/uOs6q1355kVV/9pCR3t32uTJJhjkeS9LOGfSXyUeCvqurVwOuA+4A1wI1VtQy4sa0DnAEsa6/VwFUASY4GLgVOAU4GLp0Intbmwr79Vgx5PJKkPkMLkSQvB34DuBqgqp6uqseBlcCG1mwDcFZbXglsrJ6bgaOSHAucDmytqj1V9RiwFVjRth1ZVTdXVQEb+44lSZoBw7wSOR7YDfzPJHck+XiSI4BFVfVwa/MIsKgtLwYe6tt/R6tNVd8xSV2SNEOGGSLzgROBq6rqDcAPefbWFQDtCqKG2AcAkqxOsi3Jtt27dw/7dJJ0yBhmiOwAdlTVLW398/RC5dF2K4r2d1fbvhM4rm//Ja02VX3JJPXnqKp1VTVWVWMLFy6c1qAkSc8aWohU1SPAQ0le1UqnAfcCm4CJGVargOva8ibgvDZL61TgiXbbawuwPMmC9kB9ObClbdub5NQ2K+u8vmNJkmbA/CEf/3eBTyU5DLgfOJ9ecF2b5ALgQeDs1nYzcCYwDvyotaWq9iT5MHBba3dZVe1pyxcBnwAOB25oL0nSDBlqiFTVncDYJJtOm6RtARfv5zjrgfWT1LcBr5lmNyVJHfmJdUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiEiSOjNEJEmdGSKSpM4MEUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiE
iSOjNEJEmdGSKSpM4MEUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiEiSOhtqiCR5IMndSe5Msq3Vjk6yNcn29ndBqyfJlUnGk9yV5MS+46xq7bcnWdVXP6kdf7ztm2GOR5L0s2biSuQ3q+r1VTXW1tcAN1bVMuDGtg5wBrCsvVYDV0EvdIBLgVOAk4FLJ4Kntbmwb78Vwx+OJGnCKG5nrQQ2tOUNwFl99Y3VczNwVJJjgdOBrVW1p6oeA7YCK9q2I6vq5qoqYGPfsSRJM2DYIVLAXye5PcnqVltUVQ+35UeARW15MfBQ3747Wm2q+o5J6s+RZHWSbUm27d69ezrjkST1mT/k47+5qnYm+UVga5Jv9W+sqkpSQ+4DVbUOWAcwNjY29PNJ0qFiqFciVbWz/d0FfIneM41H260o2t9drflO4Li+3Ze02lT1JZPUJUkzZGghkuSIJC+bWAaWA98ENgETM6xWAde15U3AeW2W1qnAE+221xZgeZIF7YH6cmBL27Y3yaltVtZ5fceSJM2AYd7OWgR8qc26nQ98uqr+KsltwLVJLgAeBM5u7TcDZwLjwI+A8wGqak+SDwO3tXaXVdWetnwR8AngcOCG9pIkzZChhUhV3Q+8bpL694HTJqkXcPF+jrUeWD9JfRvwmml3VpLUiZ9YlyR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktTZQCGS5J8NuyOSpNln0CuRP0tya5KLkrx8qD2SJM0aA4VIVf068DvAccDtST6d5G1D7Zkk6QVv4GciVbUd+D3gEuCfA1cm+VaSfzmszkmSXtgGfSby2iRXAPcBbwV+q6r+aVu+Yoj9kyS9gM0fsN2fAB8HPlhV/zBRrKrvJfm9ofRMkvSCN+jtrLcDn54IkCQvSvJSgKr65FQ7JpmX5I4kf9nWj09yS5LxJJ9Nclir/1xbH2/bl/Yd4wOt/u0kp/fVV7TaeJI1BzJwSdL0DRoiXwEO71t/aasN4r30boNN+EPgiqr6FeAx4IJWvwB4rNWvaO1IcgJwDvCrwAp6M8XmJZkHfAw4AzgBOLe1lSTNkEFvZ72kqp6cWKmqJyeuRKaSZAm9q5jLgfcnCb3nKP+2NdkAfAi4CljZlgE+D/xpa78SuKaqngK+m2QcOLm1G6+q+9u5rmlt7x1wTHoBW7rm+pGd+4G1bx/ZuaXZZtArkR8mOXFiJclJwD9M0X7CHwP/BfhJW/8F4PGqeqat7wAWt+XFwEMAbfsTrf1P6/vss7+6JGmGDHol8j7gc0m+BwT4J8C/mWqHJP8C2FVVtyd5y7R6OU1JVgOrAV7xileMsiuSNKcMFCJVdVuSVwOvaqVvV9X/e57dfg14R5IzgZcARwIfBY5KMr9dbSwBdrb2O+l9mHFHkvnAy4Hv99Un9O+zv/q+/V8HrAMYGxur5+m3JGlAB/IFjG8EXgucSO8h9nlTNa6qD1TVkqpaSu/B+Fer6neAm4B3tmargOva8qa2Ttv+1aqqVj+nzd46HlgG3ArcBixrs70Oa+fYdADjkSRN00BXIkk+CfwycCfw41YuYGOHc14CXJPkD4A7gKtb/Wrgk+3B+R56oUBV3ZPkWnoPzJ8BLq6qH7d+vQfYAswD1lfVPR36I0nqaNBnImPACe3K4IBV1deAr7Xl+3l2dlV/m38E/vV+9r+c3gyvfeubgc1d+iRJmr5Bb2d9k97DdEmSfmrQK5FjgHuT3Ao8NVGsqncMpVeSpFlh0BD50DA7IUmanQad4vs3SX4JWFZVX2mfVp833K5Jkl7oBv0q+AvpfRXJn7fSYuDLw+qUJGl2GPTB+sX0Pjy4F376A1W/OKxOSZJmh0FD5KmqenpipX2i3E9+S9IhbtAQ+ZskHwQOb7+t/jngfw+vW5Kk2WDQEFkD7AbuBt5N7wN+/qKhJB3iBp2d9RPgL9pLkiRg8O/O+i
6TPAOpqlce9B5JkmaNA/nurAkvofcdV0cf/O5IkmaTgZ6JVNX3+147q+qP6f3srSTpEDbo7awT+1ZfRO/KZNCrGEnSHDVoEPxR3/IzwAPA2Qe9N5KkWWXQ2Vm/OeyOSJJmn0FvZ71/qu1V9ZGD0x1J0mxyILOz3sizv2H+W/R+53z7MDoljdLSNdeP5LwPrHWuimafQUNkCXBiVf0AIMmHgOur6l3D6pgk6YVv0K89WQQ83bf+dKtJkg5hg16JbARuTfKltn4WsGE4XZIkzRaDzs66PMkNwK+30vlVdcfwuiVJmg0GvZ0F8FJgb1V9FNiR5PipGid5SZJbk3wjyT1Jfr/Vj09yS5LxJJ9Nclir/1xbH2/bl/Yd6wOt/u0kp/fVV7TaeJI1BzAWSdJBMOjP414KXAJ8oJVeDPyv59ntKeCtVfU64PXAiiSnAn8IXFFVvwI8BlzQ2l8APNbqV7R2JDkBOAf4VWAF8GdJ5iWZB3wMOAM4ATi3tZUkzZBBr0R+G3gH8EOAqvoe8LKpdqieJ9vqi9urgLfS+7126D1XOastr+TZ5yyfB05Lkla/pqqeqqrvAuPAye01XlX3t19dvKa1lSTNkEFD5OmqKtrXwSc5YpCd2hXDncAuYCvwHeDxqnqmNdkBLG7Li4GHANr2J4Bf6K/vs8/+6pKkGTJoiFyb5M+Bo5JcCHyFAX6gqqp+XFWvp/c5k5OBV3fu6TQkWZ1kW5Jtu3fvHkUXJGlOet7ZWe2W0mfpBcBe4FXAf62qrYOepKoeT3IT8CZ6QTS/XW0sAXa2ZjuB4+g9tJ8PvBz4fl99Qv8++6vve/51wDqAsbGx5/y4liSpm+e9Emm3sTZX1daq+s9V9Z8GCZAkC5Mc1ZYPB94G3AfcBLyzNVsFXNeWN7V12vavtnNvAs5ps7eOB5bR+8qV24BlbbbXYfQevk98LYskaQYM+mHDryd5Y1XddgDHPhbY0GZRvQi4tqr+Msm9wDVJ/gC4A7i6tb8a+GSScWAPvVCgqu5Jci1wL72vob+4qn4MkOQ9wBZgHrC+qu45gP5JkqZp0BA5BXhXkgfozdAKvYuU1+5vh6q6C3jDJPX76T0f2bf+j/R+dneyY10OXD5JfTOwebAhSJIOtilDJMkrqur/AqdP1U6SdGh6viuRL9P79t4Hk3yhqv7VTHRKkjQ7PN+D9fQtv3KYHZEkzT7PFyK1n2VJkp73dtbrkuyld0VyeFuGZx+sHznU3kmSXtCmDJGqmjdTHZEkzT4H8lXwkiT9DENEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktTZoD9KpRFauub6UXdBkibllYgkqTNDRJLUmSEiSerMEJEkdWaISJI6G1qIJDkuyU1J7k1yT5L3tvrRSbYm2d7+Lmj1JLkyyXiSu5Kc2HesVa399iSr+uonJbm77XNlkjy3J5KkYRnmlcgzwH+sqhOAU4GLk5wArAFurKplwI1tHeAMYFl7rQaugl7oAJcCpwAnA5dOBE9rc2HffiuGOB5J0j6GFiJV9XBVfb0t/wC4D1gMrAQ2tGYbgLPa8kpgY/XcDByV5FjgdGBrVe2pqseArcCKtu3Iqrq5qgrY2HcsSdIMmJFnIkmWAm8AbgEWVdXDbdMjwKK2vBh4qG+3Ha02VX3HJPXJzr86ybYk23bv3j2tsUiSnjX0EEny88AXgPdV1d7+be0Koobdh6paV1VjVTW2cOHCYZ9Okg4ZQw2RJC+mFyCfqqovtvKj7VYU7e+uVt8JHNe3+5JWm6q+ZJK6JGmGDHN2VoCrgfuq6iN9mzYBEzOsVgHX9dXPa7O0TgWeaLe9tgDLkyxoD9SXA1vatr1JTm3nOq/vWJKkGTDML2D8NeDfAXcnubPVPgisBa5NcgHwIHB227YZOBMYB34EnA9QVXuSfBi4rbW7rKr2tOWLgE8AhwM3tJckaYYMLUSq6u+A/X1u47RJ2hdw8X6OtR5YP0
l9G/CaaXRTkjQNfmJdktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnQ0tRJKsT7IryTf7akcn2Zpke/u7oNWT5Mok40nuSnJi3z6rWvvtSVb11U9Kcnfb58okGdZYJEmTmz/EY38C+FNgY19tDXBjVa1NsqatXwKcASxrr1OAq4BTkhwNXAqMAQXcnmRTVT3W2lwI3AJsBlYANwxxPNJQLV1z/UjO+8Dat4/kvJobhnYlUlV/C+zZp7wS2NCWNwBn9dU3Vs/NwFFJjgVOB7ZW1Z4WHFuBFW3bkVV1c1UVvaA6C0nSjJrpZyKLqurhtvwIsKgtLwYe6mu3o9Wmqu+YpC5JmkEje7DeriBqJs6VZHWSbUm27d69eyZOKUmHhJkOkUfbrSja312tvhM4rq/dklabqr5kkvqkqmpdVY1V1djChQunPQhJUs9Mh8gmYGKG1Srgur76eW2W1qnAE+221xZgeZIFbSbXcmBL27Y3yaltVtZ5fceSJM2Qoc3OSvIZ4C3AMUl20JtltRa4NskFwIPA2a35ZuBMYBz4EXA+QFXtSfJh4LbW7rKqmnhYfxG9GWCH05uV5cwsSZphQwuRqjp3P5tOm6RtARfv5zjrgfWT1LcBr5lOHyVJ0+Mn1iVJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSps/mj7oCk0Vq65vqRnfuBtW8f2bl1cHglIknqbNZfiSRZAXwUmAd8vKrWDutco/w/NmkuGtW/Ka+ADp5ZfSWSZB7wMeAM4ATg3CQnjLZXknTomNUhApwMjFfV/VX1NHANsHLEfZKkQ8Zsv521GHiob30HcMqI+iJplnAywcEz20NkIElWA6vb6pNJvj3K/kziGODvR92JIZvrY3R8s9+MjDF/OOwz7Nd0xvdL+9sw20NkJ3Bc3/qSVvsZVbUOWDdTnTpQSbZV1dio+zFMc32Mjm/2m+tjHNb4ZvszkduAZUmOT3IYcA6wacR9kqRDxqy+EqmqZ5K8B9hCb4rv+qq6Z8TdkqRDxqwOEYCq2gxsHnU/pukFe6vtIJrrY3R8s99cH+NQxpeqGsZxJUmHgNn+TESSNEKGyIgleSDJ3UnuTLJt1P05GJKsT7IryTf7akcn2Zpke/u7YJR9nI79jO9DSXa29/HOJGeOso/TkeS4JDcluTfJPUne2+pz4j2cYnxz6T18SZJbk3yjjfH3W/34JLckGU/y2TYhaXrn8nbWaCV5ABirqjkzBz/JbwBPAhur6jWt9t+APVW1NskaYEFVXTLKfna1n/F9CHiyqv77KPt2MCQ5Fji2qr6e5GXA7cBZwL9nDryHU4zvbObOexjgiKp6MsmLgb8D3gu8H/hiVV2T5H8A36iqq6ZzLq9EdNBV1d8Ce/YprwQ2tOUN9P7Rzkr7Gd+cUVUPV9XX2/IPgPvofTvEnHgPpxjfnFE9T7bVF7dXAW8FPt/qB+U9NERGr4C/TnJ7+2T9XLWoqh5uy48Ai0bZmSF5T5K72u2uWXmrZ19JlgJvAG5hDr6H+4wP5tB7mGRekjuBXcBW4DvA41X1TGuyg4MQnobI6L25qk6k903EF7dbJXNa9e6hzrX7qFcBvwy8HngY+KPRdmf6kvw88AXgfVW1t3/bXHgPJxnfnHoPq+rHVfV6et/kcTLw6mGcxxAZsara2f7uAr5E782eix5t96In7knvGnF/DqqqerT9o/0J8BfM8vex3Uf/AvCpqvpiK8+Z93Cy8c2193BCVT0O3AS8CTgqycTnAyf9mqgDZYiMUJIj2oM9khwBLAe+OfVes9YmYFVbXgVcN8K+HHQT/3
FtfptZ/D62h7JXA/dV1Uf6Ns2J93B/45tj7+HCJEe15cOBt9F79nMT8M7W7KC8h87OGqEkr6R39QG9bw/4dFVdPsIuHRRJPgO8hd63hj4KXAp8GbgWeAXwIHB2Vc3Kh9P7Gd9b6N0GKeAB4N19zw9mlSRvBv4PcDfwk1b+IL3nBrP+PZxifOcyd97D19J7cD6P3sXCtVV1WftvzjXA0cAdwLuq6qlpncsQkSR15e0sSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzv4/2LyLCkd/AwYAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qt2kPnlNhy0L"
      },
      "source": [
        "Okay, looks like most of the abstracts are around 7 to 15 sentences in length.\n",
        "\n",
        "It's good to check these things out to make sure when we do train a model or test it on unseen samples, our results aren't outlandish."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Eqps0Jw0wcQo"
      },
      "source": [
        "### Get lists of sentences\n",
        "\n",
        "When we build our deep learning model, one of its main inputs will be a list of strings (the lines of an abstract).\n",
        "\n",
        "We can get these easily from our DataFrames by calling the `tolist()` method on our `\"text\"` columns."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ybvBrdPKwmDR",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "4d305424-a2a9-4f06-f6db-4f8d89ff95b6"
      },
      "source": [
        "# Convert abstract text lines into lists \n",
        "train_sentences = train_df[\"text\"].tolist()\n",
        "val_sentences = val_df[\"text\"].tolist()\n",
        "test_sentences = test_df[\"text\"].tolist()\n",
        "len(train_sentences), len(val_sentences), len(test_sentences)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(180040, 30212, 30135)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 14
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "M-OPWZPei46_",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "c298808c-5618-4fe6-9fa5-03e0ad553c1b"
      },
      "source": [
        "# View first 10 lines of training sentences\n",
        "train_sentences[:10]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['to investigate the efficacy of @ weeks of daily low-dose oral prednisolone in improving pain , mobility , and systemic low-grade inflammation in the short term and whether the effect would be sustained at @ weeks in older adults with moderate to severe knee osteoarthritis ( oa ) .',\n",
              " 'a total of @ patients with primary knee oa were randomized @:@ ; @ received @ mg/day of prednisolone and @ received placebo for @ weeks .',\n",
              " 'outcome measures included pain reduction and improvement in function scores and systemic inflammation markers .',\n",
              " 'pain was assessed using the visual analog pain scale ( @-@ mm ) .',\n",
              " 'secondary outcome measures included the western ontario and mcmaster universities osteoarthritis index scores , patient global assessment ( pga ) of the severity of knee oa , and @-min walk distance ( @mwd ) .',\n",
              " 'serum levels of interleukin @ ( il-@ ) , il-@ , tumor necrosis factor ( tnf ) - , and high-sensitivity c-reactive protein ( hscrp ) were measured .',\n",
              " 'there was a clinically relevant reduction in the intervention group compared to the placebo group for knee pain , physical function , pga , and @mwd at @ weeks .',\n",
              " 'the mean difference between treatment arms ( @ % ci ) was @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; @ ( @-@ @ ) , p < @ ; and @ ( @-@ @ ) , p < @ , respectively .',\n",
              " 'further , there was a clinically relevant reduction in the serum levels of il-@ , il-@ , tnf - , and hscrp at @ weeks in the intervention group when compared to the placebo group .',\n",
              " 'these differences remained significant at @ weeks .']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 15
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "r36Ldgy2jDR6"
      },
      "source": [
        "Alright, we've separated our text samples. As you might've guessed, we'll have to write code to convert the text to numbers before we can use it with our machine learning models, we'll get to this soon."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rk1tXXANaxhK"
      },
      "source": [
        "## Make numeric labels (ML models require numeric labels)\n",
        "\n",
        "We're going to create one hot and label encoded labels.\n",
        "\n",
        "We could get away with just making label encoded labels, however, TensorFlow's CategoricalCrossentropy loss function likes to have one hot encoded labels (this will enable us to use label smoothing later on).\n",
        "\n",
        "To numerically encode labels we'll use Scikit-Learn's [`OneHotEncoder`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html) and [`LabelEncoder`](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html) classes."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "riWJb105awwn",
        "outputId": "906c0599-7257-401d-d279-185c8e5c6f9b"
      },
      "source": [
        "# One hot encode labels\n",
        "from sklearn.preprocessing import OneHotEncoder\n",
        "one_hot_encoder = OneHotEncoder(sparse=False)\n",
        "train_labels_one_hot = one_hot_encoder.fit_transform(train_df[\"target\"].to_numpy().reshape(-1, 1))\n",
        "val_labels_one_hot = one_hot_encoder.transform(val_df[\"target\"].to_numpy().reshape(-1, 1))\n",
        "test_labels_one_hot = one_hot_encoder.transform(test_df[\"target\"].to_numpy().reshape(-1, 1))\n",
        "\n",
        "# Check what training labels look like\n",
        "train_labels_one_hot"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[0., 0., 0., 1., 0.],\n",
              "       [0., 0., 1., 0., 0.],\n",
              "       [0., 0., 1., 0., 0.],\n",
              "       ...,\n",
              "       [0., 0., 0., 0., 1.],\n",
              "       [0., 1., 0., 0., 0.],\n",
              "       [0., 1., 0., 0., 0.]])"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 16
        }
      ]
    },
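    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Since label smoothing is one of the reasons we're one hot encoding, here's a rough sketch of what it does to a single one hot label. It assumes the usual formula `y * (1 - alpha) + alpha / num_classes`, which as far as we know is what the `label_smoothing` argument of `CategoricalCrossentropy` applies. The names `smooth_labels` and `alpha` are made up for this example and the value `0.2` is arbitrary.\n",
        "\n",
        "```python\n",
        "def smooth_labels(one_hot_row, alpha):\n",
        "    # Spread a fraction alpha of the probability mass evenly across all classes\n",
        "    num_classes = len(one_hot_row)\n",
        "    return [y * (1 - alpha) + alpha / num_classes for y in one_hot_row]\n",
        "\n",
        "# First training label from the output above (OBJECTIVE)\n",
        "one_hot_label = [0., 0., 0., 1., 0.]\n",
        "smoothed_label = smooth_labels(one_hot_label, alpha=0.2)\n",
        "print(smoothed_label)\n",
        "```\n",
        "\n",
        "The smoothed label still sums to 1, but the model is no longer pushed to be 100% confident in one class. This only works with one hot labels, since an integer label like `3` has nowhere to put the leftover probability."
      ]
    },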
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bG-iZttkkCjL"
      },
      "source": [
        "### Label encode labels"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IG8LmKhAozc_",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "45fca042-df57-40e5-c7a0-d3ba7ed81f56"
      },
      "source": [
        "# Extract labels (\"target\" columns) and encode them into integers \n",
        "from sklearn.preprocessing import LabelEncoder\n",
        "label_encoder = LabelEncoder()\n",
        "train_labels_encoded = label_encoder.fit_transform(train_df[\"target\"].to_numpy())\n",
        "val_labels_encoded = label_encoder.transform(val_df[\"target\"].to_numpy())\n",
        "test_labels_encoded = label_encoder.transform(test_df[\"target\"].to_numpy())\n",
        "\n",
        "# Check what training labels look like\n",
        "train_labels_encoded"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([3, 2, 2, ..., 4, 1, 1])"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 17
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rd-uax-AkExg"
      },
      "source": [
        "Now we've trained an instance of `LabelEncoder`, we can get the class names and number of classes using the `classes_` attribute."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "KeQ1OQ9glVaz",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "345b402e-6c44-423b-d7a8-950903753f00"
      },
      "source": [
        "# Get class names and number of classes from LabelEncoder instance \n",
        "num_classes = len(label_encoder.classes_)\n",
        "class_names = label_encoder.classes_\n",
        "num_classes, class_names"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(5, array(['BACKGROUND', 'CONCLUSIONS', 'METHODS', 'OBJECTIVE', 'RESULTS'],\n",
              "       dtype=object))"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 18
        }
      ]
    },
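    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see how the two encodings line up, here's a small plain Python sketch. It assumes (as the outputs above suggest) that both encoders order the classes alphabetically. The list `example_class_names` is copied by hand from the output above, and `to_index` and `to_one_hot` are hypothetical helpers written for this example, not Scikit-Learn functions.\n",
        "\n",
        "```python\n",
        "# Class names in the order reported by label_encoder.classes_ above\n",
        "example_class_names = ['BACKGROUND', 'CONCLUSIONS', 'METHODS', 'OBJECTIVE', 'RESULTS']\n",
        "\n",
        "def to_index(label):\n",
        "    # Label encoding: position of the label in the sorted class list\n",
        "    return example_class_names.index(label)\n",
        "\n",
        "def to_one_hot(label):\n",
        "    # One hot encoding: a 1.0 at the label's index and 0.0 everywhere else\n",
        "    index = to_index(label)\n",
        "    return [1.0 if i == index else 0.0 for i in range(len(example_class_names))]\n",
        "\n",
        "first_labels = ['OBJECTIVE', 'METHODS', 'METHODS']  # first three training targets\n",
        "print([to_index(label) for label in first_labels])\n",
        "print(to_one_hot('OBJECTIVE'))\n",
        "```\n",
        "\n",
        "Going the other way, indexing the class names with a predicted integer gives back the readable label, which is handy when inspecting a model's predictions."
      ]
    },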
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gSGeXjbmlJar"
      },
      "source": [
        "## Creating a series of model experiments\n",
        "\n",
        "We've proprocessed our data so now, in true machine learning fashion, it's time to setup a series of modelling experiments.\n",
        "\n",
        "We'll start by creating a simple baseline model to obtain a score we'll try to beat by building more and more complex models as we move towards replicating the sequence model outlined in [*Neural networks for joint sentence\n",
        "classification in medical paper abstracts*](https://arxiv.org/pdf/1612.05251.pdf).\n",
        "\n",
        "For each model, we'll train it on the training data and evaluate it on the validation data."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dJD7X7atahFC"
      },
      "source": [
        "## Model 0: Getting a baseline \n",
        "\n",
        "Our first model will be a TF-IDF Multinomial Naive Bayes classifier, as recommended by [Scikit-Learn's machine learning map](https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html).\n",
        "\n",
        "To build it, we'll create a Scikit-Learn `Pipeline` which uses the [`TfidfVectorizer`](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html) class to convert our abstract sentences to numbers using the TF-IDF (term frequency-inverse document frequency) algorithm and then learns to classify our sentences using the [`MultinomialNB`](https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html) algorithm."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Km5hWlVymnxv"
      },
      "source": [
        "from sklearn.feature_extraction.text import TfidfVectorizer\n",
        "from sklearn.naive_bayes import MultinomialNB\n",
        "from sklearn.pipeline import Pipeline\n",
        "\n",
        "# Create a pipeline\n",
        "model_0 = Pipeline([\n",
        "  (\"tf-idf\", TfidfVectorizer()),\n",
        "  (\"clf\", MultinomialNB())\n",
        "])\n",
        "\n",
        "# Fit the pipeline to the training data\n",
        "model_0.fit(X=train_sentences, \n",
        "            y=train_labels_encoded);"
      ],
      "execution_count": null,
      "outputs": []
    },
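    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To get a feel for what the TF-IDF step computes, here's a rough pure-Python sketch. It assumes the smoothed IDF and L2 normalization that `TfidfVectorizer` uses by default, and it simplifies tokenization to whitespace splitting. The toy sentences and the `tfidf()` helper are made up for illustration, they're not part of the pipeline above.\n",
        "\n",
        "```python\n",
        "import math\n",
        "from collections import Counter\n",
        "\n",
        "# Three toy sentences, for illustration only (not from the dataset)\n",
        "docs = ['patients received placebo', 'patients received treatment', 'results were significant']\n",
        "tokenized = [doc.split() for doc in docs]\n",
        "n_docs = len(tokenized)\n",
        "\n",
        "# Document frequency: in how many sentences does each term appear?\n",
        "doc_freq = Counter(term for tokens in tokenized for term in set(tokens))\n",
        "\n",
        "def tfidf(tokens):\n",
        "    # Term frequency times smoothed inverse document frequency\n",
        "    counts = Counter(tokens)\n",
        "    weights = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1) for t, c in counts.items()}\n",
        "    # L2-normalize so every sentence vector has length 1\n",
        "    norm = math.sqrt(sum(w ** 2 for w in weights.values()))\n",
        "    return {t: w / norm for t, w in weights.items()}\n",
        "\n",
        "weights = tfidf(tokenized[0])\n",
        "print(weights)\n",
        "```\n",
        "\n",
        "The rarer term (`placebo`, which appears in one sentence) ends up with a larger weight than the terms shared across sentences (`patients`, `received`), which is the signal the Naive Bayes classifier gets to work with."
      ]
    },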
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GGUtAzKem-dO"
      },
      "source": [
        "Multinomial Naive Bayes fits in a single pass over the data (it only accumulates per-class feature totals), so it trains very quickly.\n",
        "\n",
        "We can evaluate our model's accuracy on the validation dataset using the `score()` method."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "kq7BAPCmn1bM",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "ddccf385-7759-443e-8745-7a55228b710a"
      },
      "source": [
        "# Evaluate baseline on validation dataset\n",
        "model_0.score(X=val_sentences,\n",
        "              y=val_labels_encoded)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0.7218323844829869"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 20
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Mp0aq6XpnPCG"
      },
      "source": [
        "Nice! Looks like ~72.2% accuracy will be the number to beat with our deeper models.\n",
        "\n",
        "Now let's make some predictions with our baseline model to further evaluate it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vuGl9z2NjAl8",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0c70de66-d72a-4d86-938e-591cb9723365"
      },
      "source": [
        "# Make predictions\n",
        "baseline_preds = model_0.predict(val_sentences)\n",
        "baseline_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([4, 1, 3, ..., 4, 4, 1])"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 21
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jh2K8p3sndlG"
      },
      "source": [
        "To evaluate our baseline's predictions, we'll import the `calculate_results()` function we created in the [previous notebook](https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/08_introduction_to_nlp_in_tensorflow.ipynb) and added to our [`helper_functions.py` script](https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/extras/helper_functions.py), then use it to compare them to the ground truth labels.\n",
        "\n",
        "More specifically, the `calculate_results()` function will help us obtain the following:\n",
        "* Accuracy\n",
        "* Precision\n",
        "* Recall\n",
        "* F1-score"
      ]
    },
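    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "For intuition, here's a rough NumPy sketch of how these four metrics can be computed for a multi-class problem. It is not the code in `helper_functions.py`: the `weighted_metrics()` name is hypothetical and the support-weighted averaging (each class weighted by its share of the true labels) is an assumption.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "def weighted_metrics(y_true, y_pred):\n",
        "    # Sketch of support-weighted precision, recall and F1 (the averaging is an assumption)\n",
        "    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)\n",
        "    precision = recall = f1 = 0.0\n",
        "    for label in np.unique(y_true):\n",
        "        tp = np.sum((y_pred == label) & (y_true == label))\n",
        "        predicted = np.sum(y_pred == label)\n",
        "        actual = np.sum(y_true == label)\n",
        "        p = tp / predicted if predicted else 0.0\n",
        "        r = tp / actual\n",
        "        f = 2 * p * r / (p + r) if (p + r) else 0.0\n",
        "        weight = actual / len(y_true)  # share of samples with this true label\n",
        "        precision += weight * p\n",
        "        recall += weight * r\n",
        "        f1 += weight * f\n",
        "    accuracy = float(np.mean(y_true == y_pred)) * 100  # reported as a percentage\n",
        "    return {'accuracy': accuracy, 'precision': float(precision), 'recall': float(recall), 'f1': float(f1)}\n",
        "\n",
        "# Hypothetical labels, for illustration only\n",
        "toy_results = weighted_metrics(y_true=[0, 0, 1, 1, 2], y_pred=[0, 1, 1, 1, 2])\n",
        "print(toy_results)\n",
        "```\n",
        "\n",
        "One thing worth noticing: with support weighting, recall works out to the same value as accuracy (as a fraction), since both count correct predictions over all samples."
      ]
    },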
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "V5GaqHjtHWUM"
      },
      "source": [
        "### Download helper functions script\n",
        "\n",
        "Let's get our `helper_functions.py` script we've been using to store helper functions we've created in previous notebooks."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "a6y-nK2tGwOL",
        "outputId": "08dfb08a-2fc7-4057-bac3-218a07529ef9"
      },
      "source": [
        "# Download helper functions script\n",
        "!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "--2021-04-16 05:17:09--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py\n",
            "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
            "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 10246 (10K) [text/plain]\n",
            "Saving to: ‘helper_functions.py’\n",
            "\n",
            "\rhelper_functions.py   0%[                    ]       0  --.-KB/s               \rhelper_functions.py 100%[===================>]  10.01K  --.-KB/s    in 0s      \n",
            "\n",
            "2021-04-16 05:17:10 (113 MB/s) - ‘helper_functions.py’ saved [10246/10246]\n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nmXBYc5SHitH"
      },
      "source": [
        "Now we've got the helper functions script we can import the `calculate_results()` function and see how our baseline model went."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "P44NMOt1GzZL"
      },
      "source": [
        "# Import calculate_results helper function\n",
        "from helper_functions import calculate_results"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9WN_TLx2jv7T",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "f1ebc8f7-57cc-4350-814e-fc6f1e3d5c6b"
      },
      "source": [
        "# Calculate baseline results\n",
        "baseline_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                     y_pred=baseline_preds)\n",
        "baseline_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 72.1832384482987,\n",
              " 'f1': 0.6989250353450294,\n",
              " 'precision': 0.7186466952323352,\n",
              " 'recall': 0.7218323844829869}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 24
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MADIlN1QaiTW"
      },
      "source": [
        "## Preparing our data for deep sequence models\n",
        "\n",
        "Excellent! We've got a working baseline to try and improve upon.\n",
        "\n",
        "But before we start building deeper models, we've got to create vectorization and embedding layers.\n",
        "\n",
        "The vectorization layer will convert our text to numbers and the embedding layer will capture the relationships between those numbers.\n",
        "\n",
        "To start creating our vectorization and embedding layers, we'll need to import the appropriate libraries (namely TensorFlow and NumPy)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vCR0F7Rhptcp"
      },
      "source": [
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "from tensorflow.keras import layers"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JTEPCjOuUNdj"
      },
      "source": [
        "Since we'll be turning our sentences into numbers, it's a good idea to figure out how many words are in each sentence.\n",
        "\n",
        "When our model goes through our sentences, it works best when they're all the same length (this is important for creating batches of the same size tensors).\n",
        "\n",
        "For example, if one sentence is eight words long and another is 29 words long, we want to pad the eight word sentence with zeros so it ends up being the same length as the 29 word sentence.\n",
        "\n",
        "Let's write some code to find the average length of sentences in the training set."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1Y-V_9-KrH7y",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "abab0d65-ca86-460e-8a57-24e3b9e0bf6f"
      },
      "source": [
        "# How long is each sentence on average?\n",
        "sent_lens = [len(sentence.split()) for sentence in train_sentences]\n",
        "avg_sent_len = np.mean(sent_lens)\n",
        "avg_sent_len # return average sentence length (in tokens)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "26.338269273494777"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 41
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oToFcpVTU6fU"
      },
      "source": [
        "How about the distribution of sentence lengths?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Y9S27ACkroai",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 265
        },
        "outputId": "39c3d437-0543-4687-ae76-ed73866f21b6"
      },
      "source": [
        "# What's the distribution look like?\n",
        "import matplotlib.pyplot as plt\n",
        "plt.hist(sent_lens, bins=7);"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYkAAAD4CAYAAAAZ1BptAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAXsElEQVR4nO3df6xf9X3f8edrdiC/GgzhjjEbZqdxWzmobYgFrtJVadyCIVXNJBIZbcPLrFhroEundolppNElQYKuKysSoaKxh4kiDKPpsBYz1wOiaNIMmEAAQwi3QIItwA420C4K1Ml7f3w/Tr673I+vfa+519c8H9JX95z353PO+Xw4l/vyOd9z7zdVhSRJ4/kHMz0ASdKxy5CQJHUZEpKkLkNCktRlSEiSuubO9ACOtlNPPbUWLlw408OQpFnlgQce+H5VjYytH3chsXDhQnbs2DHTw5CkWSXJd8ere7tJktRlSEiSugwJSVKXISFJ6jIkJEldhoQkqWvCkEiyIcmeJI+Oqf9ukm8n2Znkj4fqVyQZTfJEkvOH6itabTTJuqH6oiT3tvqtSU5o9RPb+mhrX3g0JixJOnyHcyVxE7BiuJDk14GVwC9V1fuAP2n1JcAq4H1tmy8mmZNkDnA9cAGwBLik9QW4Bri2qt4L7AfWtPoaYH+rX9v6SZKm0YQhUVXfAPaNKf8OcHVVvdr67Gn1lcCmqnq1qp4GRoFz2mu0qp6qqteATcDKJAE+DNzett8IXDS0r41t+XZgeesvSZomk/2N658D/mmSq4AfAn9QVfcD84HtQ/12tRrAs2Pq5wLvBl6qqgPj9J9/cJuqOpDk5db/+2MHk2QtsBbgzDPPnOSUYOG6r01625nwzNUfmekhSDrOTfaN67nAKcAy4N8Dt83kv/Kr6saqWlpVS0dGXvenRyRJkzTZkNgFfLUG7gN+DJwK7AbOGOq3oNV69ReBeUnmjqkzvE1rP6n1lyRNk8mGxH8Hfh0gyc8BJzC4DbQZWNWeTFoELAbuA+4HFrcnmU5g8Ob25hp8wPY9wMVtv6uBO9ry5rZOa7+7/EBuSZpWE74nkeQW4EPAqUl2AVcCG4AN7bHY14DV7Qf4ziS3AY8BB4DLqupHbT+XA1uBOcCGqtrZDvEZYFOSLwAPAutbfT3w5SSjDN44X3UU5itJOgIThkRVXdJp+hed/lcBV41T3wJsGaf+FIOnn8bWfwh8dKLxSZLeOP7GtSSpy5CQJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6jIkJEldhoQkqcuQkCR1GRKSpC5DQpLUZUhIkroMCUlSlyEhSeoyJCRJXROGRJINSfa0T6Eb2/b7SSrJqW09Sa5LMprk4SRnD/VdneTJ9lo9VP9AkkfaNtclSaufkmRb678tyclHZ8qSpMN1OFcSNwErxhaTnAGcB3xvqHwBg8+1XgysBW5ofU9h8LGn5zL4FLorh37o3wB8Ymi7g8daB9xVVYuBu9q6JGkaTRgSVfUNBp8xPda1wKeBGqqtBG6uge3AvCSnA+cD26pqX1XtB7YBK1rbu6pqe/uM7JuBi4b2tbEtbxyqS5KmyaTek0iyEthdVd8a0zQfeHZofVerHaq+a5w6wGlV9Vxbfh44bTJjlSRN3twj3SDJ24E/ZHCraVpUVSWpXnuStQxub3HmmWdO17Ak6bg3mSuJnwUWAd9K8gywAPhmkn8E7AbOGOq7oNUOVV8wTh3ghXY7ivZ1T29AVXVjVS2tqqUjIyOTmJIkaTxHHBJV9UhV/cOqWlhVCxncIjq7qp4HNgOXtqeclgEvt1tGW4Hzkpzc3rA+D9ja2l5Jsqw91XQpcEc71Gbg4FNQq4fqkqRpcjiPwN4C/B/g55PsSrLmEN23AE8Bo8BfAJ8EqKp9wOeB+9vrc61G6/Olts3fAHe2+tXAbyZ5EviNti5JmkYTvidRVZdM0L5waL
mAyzr9NgAbxqnvAM4ap/4isHyi8UmS3jj+xrUkqcuQkCR1GRKSpC5DQpLUZUhIkroMCUlSlyEhSeoyJCRJXYaEJKnLkJAkdRkSkqQuQ0KS1GVISJK6DAlJUpchIUnqMiQkSV2GhCSp63A+vnRDkj1JHh2q/ack307ycJK/SjJvqO2KJKNJnkhy/lB9RauNJlk3VF+U5N5WvzXJCa1+Ylsfbe0Lj9akJUmH53CuJG4CVoypbQPOqqpfBL4DXAGQZAmwCnhf2+aLSeYkmQNcD1wALAEuaX0BrgGurar3AvuBg5+hvQbY3+rXtn6SpGk0YUhU1TeAfWNqf11VB9rqdmBBW14JbKqqV6vqaWAUOKe9Rqvqqap6DdgErEwS4MPA7W37jcBFQ/va2JZvB5a3/pKkaXI03pP418CdbXk+8OxQ265W69XfDbw0FDgH6//fvlr7y63/6yRZm2RHkh179+6d8oQkSQNTCokknwUOAF85OsOZnKq6saqWVtXSkZGRmRyKJB1X5k52wyT/CvgtYHlVVSvvBs4Y6rag1ejUXwTmJZnbrhaG+x/c164kc4GTWn9J0jSZ1JVEkhXAp4HfrqofDDVtBla1J5MWAYuB+4D7gcXtSaYTGLy5vbmFyz3AxW371cAdQ/ta3ZYvBu4eCiNJ0jSY8EoiyS3Ah4BTk+wCrmTwNNOJwLb2XvL2qvo3VbUzyW3AYwxuQ11WVT9q+7kc2ArMATZU1c52iM8Am5J8AXgQWN/q64EvJxll8Mb5qqMwX0nSEZgwJKrqknHK68epHex/FXDVOPUtwJZx6k8xePppbP2HwEcnGp8k6Y3jb1xLkroMCUlSlyEhSeoyJCRJXYaEJKnLkJAkdRkSkqQuQ0KS1GVISJK6DAlJUpchIUnqMiQkSV2GhCSpy5CQJHUZEpKkLkNCktRlSEiSuiYMiSQbkuxJ8uhQ7ZQk25I82b6e3OpJcl2S0SQPJzl7aJvVrf+TSVYP1T+Q5JG2zXVpn4faO4YkafoczpXETcCKMbV1wF1VtRi4q60DXAAsbq+1wA0w+IHP4LOxz2XwUaVXDv3QvwH4xNB2KyY4hiRpmkwYElX1DWDfmPJKYGNb3ghcNFS/uQa2A/OSnA6cD2yrqn1VtR/YBqxobe+qqu1VVcDNY/Y13jEkSdNksu9JnFZVz7Xl54HT2vJ84Nmhfrta7VD1XePUD3WM10myNsmOJDv27t07ielIksYz5Teu2xVAHYWxTPoYVXVjVS2tqqUjIyNv5FAk6U1lsiHxQrtVRPu6p9V3A2cM9VvQaoeqLxinfqhjSJKmyWRDYjNw8Aml1cAdQ/VL21NOy4CX2y2jrcB5SU5ub1ifB2xtba8kWdaearp0zL7GO4YkaZrMnahDkluADwGnJtnF4Cmlq4HbkqwBvgt8rHXfAlwIjAI/AD4OUFX7knweuL/1+1xVHXwz/JMMnqB6G3Bne3GIY0iSpsmEIVFVl3Salo/Tt4DLOvvZAGwYp74DOGuc+ovjHUOSNH38jWtJUpchIUnqMiQkSV2GhCSpy5CQJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6jIkJEldhoQkqcuQkCR1GRKSpC5DQpLUZUhIkroMCUlS15RCIsm/S7IzyaNJbkny1iSLktybZDTJrUlOaH1PbOujrX3h0H6uaPUnkpw/VF/RaqNJ1k1lrJKkIzfpkEgyH/i3wNKqOguYA6wCrgGurar3AvuBNW2TNcD+Vr+29SPJkrbd+4AVwBeTzEkyB7geuABYAlzS+kqSpslUbzfNBd6WZC7wduA54MPA7a19I3BRW17Z1mnty5Ok1TdV1atV9TQwCpzTXqNV9VRVvQZsan0lSdNk0iFRVbuBPwG+xyAcXgYeAF6qqgOt2y5gflueDzzbtj3Q+r97uD5mm179dZKsTbIjyY69e/dOdkqSpDGmcrvpZAb/sl8E/GPgHQxuF027qrqxqpZW1dKRkZGZGIIkHZemcrvpN4Cnq2
pvVf098FXgg8C8dvsJYAGwuy3vBs4AaO0nAS8O18ds06tLkqbJVELie8CyJG9v7y0sBx4D7gEubn1WA3e05c1tndZ+d1VVq69qTz8tAhYD9wH3A4vb01InMHhze/MUxitJOkJzJ+4yvqq6N8ntwDeBA8CDwI3A14BNSb7QauvbJuuBLycZBfYx+KFPVe1MchuDgDkAXFZVPwJIcjmwlcGTUxuqaudkxytJOnKTDgmAqroSuHJM+SkGTyaN7ftD4KOd/VwFXDVOfQuwZSpjlCRNnr9xLUnqMiQkSV2GhCSpy5CQJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6jIkJEldhoQkqcuQkCR1GRKSpC5DQpLUZUhIkroMCUlSlyEhSeqaUkgkmZfk9iTfTvJ4kl9JckqSbUmebF9Pbn2T5Loko0keTnL20H5Wt/5PJlk9VP9AkkfaNte1z9KWJE2TqV5J/BnwP6vqF4BfAh4H1gF3VdVi4K62DnABsLi91gI3ACQ5hcFHoJ7L4GNPrzwYLK3PJ4a2WzHF8UqSjsCkQyLJScCvAesBquq1qnoJWAlsbN02Ahe15ZXAzTWwHZiX5HTgfGBbVe2rqv3ANmBFa3tXVW2vqgJuHtqXJGkaTOVKYhGwF/ivSR5M8qUk7wBOq6rnWp/ngdPa8nzg2aHtd7Xaoeq7xqm/TpK1SXYk2bF3794pTEmSNGwqITEXOBu4oareD/xffnprCYB2BVBTOMZhqaobq2ppVS0dGRl5ow8nSW8aUwmJXcCuqrq3rd/OIDReaLeKaF/3tPbdwBlD2y9otUPVF4xTlyRNk0mHRFU9Dzyb5OdbaTnwGLAZOPiE0mrgjra8Gbi0PeW0DHi53ZbaCpyX5OT2hvV5wNbW9kqSZe2ppkuH9iVJmgZzp7j97wJfSXIC8BTwcQbBc1uSNcB3gY+1vluAC4FR4AetL1W1L8nngftbv89V1b62/EngJuBtwJ3tJUmaJlMKiap6CFg6TtPycfoWcFlnPxuADePUdwBnTWWMkqTJ8zeuJUldhoQkqcuQkCR1GRKSpC5DQpLUZUhIkroMCUlSlyEhSeoyJCRJXYaEJKnLkJAkdRkSkqQuQ0KS1GVISJK6DAlJUpchIUnqMiQkSV1TDokkc5I8mOR/tPVFSe5NMprk1vbRpiQ5sa2PtvaFQ/u4otWfSHL+UH1Fq40mWTfVsUqSjszRuJL4FPD40Po1wLVV9V5gP7Cm1dcA+1v92taPJEuAVcD7gBXAF1vwzAGuBy4AlgCXtL6SpGkypZBIsgD4CPClth7gw8DtrctG4KK2vLKt09qXt/4rgU1V9WpVPQ2MAue012hVPVVVrwGbWl9J0jSZ6pXEfwE+Dfy4rb8beKmqDrT1XcD8tjwfeBagtb/c+v+kPmabXv11kqxNsiPJjr17905xSpKkgyYdEkl+C9hTVQ8cxfFMSlXdWFVLq2rpyMjITA9Hko4bc6ew7QeB305yIfBW4F3AnwHzksxtVwsLgN2t/27gDGBXkrnAScCLQ/WDhrfp1SVJ02DSVxJVdUVVLaiqhQzeeL67qv45cA9wceu2GrijLW9u67T2u6uqWn1Ve/ppEbAYuA+4H1jcnpY6oR1j82THK0k6clO5kuj5DLApyReAB4H1rb4e+HKSUWAfgx/6VNXOJLcBjwEHgMuq6kcASS4HtgJzgA1VtfMNGK8kqeOohERVfR34elt+isGTSWP7/BD4aGf7q4CrxqlvAbYcjTFKko6cv3EtSeoyJCRJXYaEJKnLkJAkdRkSkqQuQ0KS1GVISJK6DAlJUpchIUnqMiQkSV2GhCSpy5CQJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6pp0SCQ5I8k9SR5LsjPJp1r9lCTbkjzZvp7c6klyXZLRJA8nOXtoX6tb/yeTrB6qfyDJI22b65JkKpOVJB2ZqVxJHAB+v6qWAMuAy5IsAdYBd1XVYuCutg5wAbC4vdYCN8AgVIArgX
MZfOzplQeDpfX5xNB2K6YwXknSEZp0SFTVc1X1zbb8t8DjwHxgJbCxddsIXNSWVwI318B2YF6S04HzgW1Vta+q9gPbgBWt7V1Vtb2qCrh5aF+SpGlwVN6TSLIQeD9wL3BaVT3Xmp4HTmvL84Fnhzbb1WqHqu8apz7e8dcm2ZFkx969e6c0F0nST005JJK8E/hL4Peq6pXhtnYFUFM9xkSq6saqWlpVS0dGRt7ow0nSm8bcqWyc5C0MAuIrVfXVVn4hyelV9Vy7ZbSn1XcDZwxtvqDVdgMfGlP/eqsvGKe/moXrvjbTQzhsz1z9kZkegqRJmMrTTQHWA49X1Z8ONW0GDj6htBq4Y6h+aXvKaRnwcrsttRU4L8nJ7Q3r84Ctre2VJMvasS4d2pckaRpM5Urig8C/BB5J8lCr/SFwNXBbkjXAd4GPtbYtwIXAKPAD4OMAVbUvyeeB+1u/z1XVvrb8SeAm4G3Ane0lSZomkw6JqvrfQO/3FpaP07+Ayzr72gBsGKe+AzhrsmOUJE2Nv3EtSeoyJCRJXYaEJKnLkJAkdRkSkqQuQ0KS1GVISJK6DAlJUpchIUnqMiQkSV2GhCSpy5CQJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6jrmQyLJiiRPJBlNsm6mxyNJbyZT+YzrN1ySOcD1wG8Cu4D7k2yuqsdmdmQ6UgvXfW2mh3BEnrn6IzM9BOmYcKxfSZwDjFbVU1X1GrAJWDnDY5KkN41j+koCmA88O7S+Czh3bKcka4G1bfXvkjwxiWOdCnx/Etsdq46n+Uz7XHLNG7p7z82x6808n38yXvFYD4nDUlU3AjdOZR9JdlTV0qM0pBl3PM3neJoLHF/zOZ7mAs5nPMf67abdwBlD6wtaTZI0DY71kLgfWJxkUZITgFXA5hkekyS9aRzTt5uq6kCSy4GtwBxgQ1XtfIMON6XbVceg42k+x9Nc4Piaz/E0F3A+r5OqOhoDkSQdh471202SpBlkSEiSugwJZv+f/kjyTJJHkjyUZEernZJkW5In29eTZ3qcPUk2JNmT5NGh2rjjz8B17Vw9nOTsmRv5+Drz+aMku9s5eijJhUNtV7T5PJHk/JkZ9fiSnJHkniSPJdmZ5FOtPuvOzyHmMlvPzVuT3JfkW20+/7HVFyW5t4371vbQD0lObOujrX3hYR2oqt7ULwZviP8N8B7gBOBbwJKZHtcRzuEZ4NQxtT8G1rXldcA1Mz3OQ4z/14CzgUcnGj9wIXAnEGAZcO9Mj/8w5/NHwB+M03dJ+547EVjUvhfnzPQchsZ3OnB2W/4Z4DttzLPu/BxiLrP13AR4Z1t+C3Bv+29+G7Cq1f8c+J22/Engz9vyKuDWwzmOVxLH75/+WAlsbMsbgYtmcCyHVFXfAPaNKffGvxK4uQa2A/OSnD49Iz08nfn0rAQ2VdWrVfU0MMrge/KYUFXPVdU32/LfAo8z+EsIs+78HGIuPcf6uamq+ru2+pb2KuDDwO2tPvbcHDxntwPLk2Si4xgS4//pj0N94xyLCvjrJA+0P1ECcFpVPdeWnwdOm5mhTVpv/LP5fF3ebsFsGLr9N2vm025PvJ/Bv1hn9fkZMxeYpecmyZwkDwF7gG0MrnZeqqoDrcvwmH8yn9b+MvDuiY5hSBwffrWqzgYuAC5L8mvDjTW4vpy1zzrP9vE3NwA/C/wy8Bzwn2d2OEcmyTuBvwR+r6peGW6bbednnLnM2nNTVT+qql9m8NcozgF+4Wgfw5A4Dv70R1Xtbl/3AH/F4JvlhYOX+e3rnpkb4aT0xj8rz1dVvdD+h/4x8Bf89LbFMT+fJG9h8EP1K1X11VaelednvLnM5nNzUFW9BNwD/AqDW3wHf1F6eMw/mU9rPwl4caJ9GxKz/E9/JHlHkp85uAycBzzKYA6rW7fVwB0zM8JJ641/M3Bpe4pmGfDy0G2PY9aY+/L/jME5gsF8VrUnTxYBi4H7pnt8Pe2e9Xrg8ar606GmWXd+en
OZxedmJMm8tvw2Bp+78ziDsLi4dRt7bg6es4uBu9tV4KHN9Dv0x8KLwRMZ32FwP++zMz2eIxz7exg8gfEtYOfB8TO413gX8CTwv4BTZnqsh5jDLQwu8/+ewT3UNb3xM3ii4/p2rh4Bls70+A9zPl9u4324/c96+lD/z7b5PAFcMNPjHzOXX2VwK+lh4KH2unA2np9DzGW2nptfBB5s434U+A+t/h4GYTYK/DfgxFZ/a1sfbe3vOZzj+Gc5JEld3m6SJHUZEpKkLkNCktRlSEiSugwJSVKXISFJ6jIkJEld/w854y0AXzoObAAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "BH7_Yaz1U9yJ"
      },
      "source": [
        "Looks like the vast majority of sentences are between 0 and 50 tokens in length.\n",
        "\n",
        "We can use NumPy's [`percentile`](https://numpy.org/doc/stable/reference/generated/numpy.percentile.html) to find the value which covers 95% of the sentence lengths."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4e5nUagxr4r5",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "407d355c-cc4f-43fe-8cc1-24387d8d2bbc"
      },
      "source": [
        "# How long of a sentence covers 95% of the lengths?\n",
        "output_seq_len = int(np.percentile(sent_lens, 95))\n",
        "output_seq_len"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "55"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 43
        }
      ]
    },
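    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick sanity check on what a 95th percentile cutoff means, here's a small sketch on made-up lengths (`toy_lens` is hypothetical, it's not our training data). It measures the fraction of sentences that would fit inside the cutoff without being truncated.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "# Hypothetical sentence lengths: 1 to 99 tokens plus one very long outlier\n",
        "toy_lens = np.append(np.arange(1, 100), 296)\n",
        "\n",
        "cutoff = int(np.percentile(toy_lens, 95))\n",
        "covered = float(np.mean(toy_lens <= cutoff))  # fraction of sentences that fit without truncation\n",
        "print(f'Cutoff: {cutoff} | Covered: {covered:.2f} | Max length: {toy_lens.max()}')\n",
        "```\n",
        "\n",
        "A single long outlier barely moves the percentile, whereas it fully determines the maximum."
      ]
    },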
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Nhre7MPBVfK2"
      },
      "source": [
        "Wonderful! It looks like 95% of the sentences in our training set have a length of 55 tokens or fewer.\n",
        "\n",
        "When we create our tokenization layer, we'll use this value to turn all of our sentences into the same length: sentences shorter than 55 tokens get padded with zeros and sentences longer than 55 tokens get truncated (tokens after the 55th get cut off).\n",
        "\n",
        "> 🤔 **Question:** Why 95%?\n",
        "\n",
        "We could use the max sentence length of the sentences in the training set."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oEZbyvh1WCBw",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "9c3d865b-ad12-49dd-e014-cc947ca71ef4"
      },
      "source": [
        "# Maximum sentence length in the training set\n",
        "max(sent_lens)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "296"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 44
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tIWIlFV4WF8R"
      },
      "source": [
        "However, since hardly any sentences even come close to the max length, it would mean the majority of the data we pass to our model would be zeros (since all sentences below the max length would get padded with zeros).\n",
        "\n",
        "> 🔑 **Note:** The steps we've gone through are good practice when working with a text corpus for an NLP problem. You want to know how long your samples are and what the distribution of them is. See Section 4 (Data Analysis) of the [PubMed 200k RCT paper](https://arxiv.org/pdf/1710.06071.pdf) for further examples."
      ]
    },
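    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The padding and truncation described above can be sketched in a few lines. The `pad_or_truncate()` helper below is hypothetical and only mirrors the description (zeros appended to short sequences, tokens past the limit dropped from long ones), it's not how any particular library implements it.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "def pad_or_truncate(token_ids, output_len):\n",
        "    # Keep at most output_len tokens, then right-pad with zeros up to output_len\n",
        "    trimmed = list(token_ids)[:output_len]\n",
        "    return np.array(trimmed + [0] * (output_len - len(trimmed)))\n",
        "\n",
        "# Hypothetical token ids and a short target length, for illustration only\n",
        "short_seq = pad_or_truncate([22, 934, 12], output_len=5)\n",
        "long_seq = pad_or_truncate([22, 934, 12, 9, 379, 6, 370], output_len=5)\n",
        "print(short_seq)\n",
        "print(long_seq)\n",
        "```\n",
        "\n",
        "Both results have the same length, which is what lets us stack many sentences into a single batch tensor."
      ]
    },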
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uvhbRw7-uMwH"
      },
      "source": [
        "### Create text vectorizer\n",
        "\n",
        "Now that we've got a little more information about our texts, let's create a way to turn them into numbers.\n",
        "\n",
        "To do so, we'll use the [`TextVectorization`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization) layer from TensorFlow.\n",
        "\n",
        "We'll keep all the parameters default except for `max_tokens` (the number of unique words in our dataset) and `output_sequence_length` (our desired output length for each vectorized sentence).\n",
        "\n",
        "Section 3.2 of the [PubMed 200k RCT paper](https://arxiv.org/pdf/1710.06071.pdf) states the vocabulary size of the PubMed 20k dataset as 68,000. So we'll use that as our `max_tokens` parameter."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "xniPYW60uzby"
      },
      "source": [
        "# How many words are in our vocabulary? (taken from 3.2 in https://arxiv.org/pdf/1710.06071.pdf)\n",
        "max_tokens = 68000"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tu25jIo-YSuW"
      },
      "source": [
        "And since we discovered a sentence length of 55 covers 95% of the training sentences, we'll use that as our `output_sequence_length` parameter."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "gtfQ27MNpy-v"
      },
      "source": [
        "# Create text vectorizer\n",
        "from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
        "\n",
        "text_vectorizer = TextVectorization(max_tokens=max_tokens, # number of words in vocabulary\n",
        "                                    output_sequence_length=output_seq_len) # desired output length of vectorized sequences (55, the 95th percentile found above)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "y_Y7U8SdY0bO"
      },
      "source": [
        "Great! Our `text_vectorizer` is ready. Let's adapt it to the training data (let it read the training data and figure out which number should represent which word) and then test it out."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "AbJtmyd1sWW8"
      },
      "source": [
        "# Adapt text vectorizer to training sentences\n",
        "text_vectorizer.adapt(train_sentences)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "gVZDwaymsbLa",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0e74e147-a36d-4e5f-a6c4-3d2da572ded2"
      },
      "source": [
        "# Test out text vectorizer\n",
        "import random\n",
        "target_sentence = random.choice(train_sentences)\n",
        "print(f\"Text:\\n{target_sentence}\")\n",
        "print(f\"\\nLength of text: {len(target_sentence.split())}\")\n",
        "print(f\"\\nVectorized text:\\n{text_vectorizer([target_sentence])}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Text:\n",
            "by randomisation , patients were allocated to usual hospital care or hospital-at-home , which included discharge at day @ of admission , followed by home treatment with homes visits by community nurses until day @ of treatment .\n",
            "\n",
            "Length of text: 38\n",
            "\n",
            "Vectorized text:\n",
            "[[   22   934    12     9   379     6   370   237    77    16 18633   126\n",
            "    121   696    15   108     4  1041   284    22   548    19     7  2824\n",
            "    620    22   613  1583   634   108     4    19     0     0     0     0\n",
            "      0     0     0     0     0     0     0     0     0     0     0     0\n",
            "      0     0     0     0     0     0     0]]\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wdKJpk-8sjYn"
      },
      "source": [
        "Cool, we've now got a way to turn our sequences into numbers.\n",
        "\n",
        "> 🛠 **Exercise:** Try running the cell above a dozen or so times. What do you notice about sequences with a length less than 55?\n",
        "\n",
        "Using the [`get_vocabulary()`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization) method of our `text_vectorizer` we can find out a few different tidbits about our text."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "IS80FGEhsgVe",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "ffc95291-4259-4ce5-e0a0-6370727e6d9f"
      },
      "source": [
        "# How many words in our training vocabulary?\n",
        "rct_20k_text_vocab = text_vectorizer.get_vocabulary()\n",
        "print(f\"Number of words in vocabulary: {len(rct_20k_text_vocab)}\")\n",
        "print(f\"Most common words in the vocabulary: {rct_20k_text_vocab[:5]}\")\n",
        "print(f\"Least common words in the vocabulary: {rct_20k_text_vocab[-5:]}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Number of words in vocabulary: 64841\n",
            "Most common words in the vocabulary: ['', '[UNK]', 'the', 'and', 'of']\n",
            "Least common words in the vocabulary: ['aainduced', 'aaigroup', 'aachener', 'aachen', 'aaacp']\n"
          ],
          "name": "stdout"
        }
      ]
    },
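    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Notice the vocabulary starts with `''` (the padding token) and `'[UNK]'` (the token for unknown words). To make that concrete, here's a simplified pure-Python sketch of the idea behind adapting a vectorizer. The `build_vocab()` and `vectorize()` helpers are hypothetical, and the real layer's punctuation handling and tie-breaking may differ.\n",
        "\n",
        "```python\n",
        "import string\n",
        "from collections import Counter\n",
        "\n",
        "# Translation table that deletes ASCII punctuation (an approximation of the layer's standardization)\n",
        "PUNCT_TABLE = str.maketrans('', '', string.punctuation)\n",
        "\n",
        "def tokenize(sentence):\n",
        "    # Lowercase, strip punctuation, then split on whitespace\n",
        "    return sentence.lower().translate(PUNCT_TABLE).split()\n",
        "\n",
        "def build_vocab(sentences, max_tokens):\n",
        "    # Count tokens, then reserve index 0 for padding and index 1 for unknown tokens\n",
        "    counts = Counter(tok for s in sentences for tok in tokenize(s))\n",
        "    return ['', '[UNK]'] + [tok for tok, _ in counts.most_common(max_tokens - 2)]\n",
        "\n",
        "def vectorize(sentence, vocab):\n",
        "    # Map each token to its index, falling back to 1 ([UNK]) for unseen tokens\n",
        "    lookup = {tok: i for i, tok in enumerate(vocab)}\n",
        "    return [lookup.get(tok, 1) for tok in tokenize(sentence)]\n",
        "\n",
        "# Toy sentences, for illustration only\n",
        "toy_vocab = build_vocab(['the patients and the controls', 'the patients improved'], max_tokens=10)\n",
        "toy_vector = vectorize('The patients relapsed .', toy_vocab)\n",
        "print(toy_vocab)\n",
        "print(toy_vector)\n",
        "```\n",
        "\n",
        "More frequent words get smaller indices, and a word that never appeared during adaptation (`relapsed`) maps to index 1."
      ]
    },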
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "w4F6atcSa26q"
      },
      "source": [
        "And if we want to inspect the configuration of our `text_vectorizer`, we can use the `get_config()` method."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ly5BSLkGZnPO",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "167234f4-ca80-45b8-a4d5-168b55aaec65"
      },
      "source": [
        "# Get the config of our text vectorizer\n",
        "text_vectorizer.get_config()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'dtype': 'string',\n",
              " 'max_tokens': 68000,\n",
              " 'name': 'text_vectorization',\n",
              " 'ngrams': None,\n",
              " 'output_mode': 'int',\n",
              " 'output_sequence_length': 55,\n",
              " 'pad_to_max_tokens': True,\n",
              " 'split': 'whitespace',\n",
              " 'standardize': 'lower_and_strip_punctuation',\n",
              " 'trainable': True}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 50
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GZvDSTrTp1Wy"
      },
      "source": [
        "### Create custom text embedding\n",
        "\n",
        "Our `text_vectorizer` layer maps the words in our text directly to numbers. However, this doesn't necessarily capture the relationships between those numbers.\n",
        "\n",
        "To create a richer numerical representation of our text, we can use an **embedding**.\n",
        "\n",
        "As our model learns (by going through many different examples of abstract sentences and their labels), it'll update its embedding to better represent the relationships between tokens in our corpus.\n",
        "\n",
        "We can create a trainable embedding layer using TensorFlow's [`Embedding`](https://www.tensorflow.org/tutorials/text/word_embeddings) layer.\n",
        "\n",
        "Once again, the main parameters we're concerned with here are the inputs and outputs of our `Embedding` layer.\n",
        "\n",
        "The `input_dim` parameter defines the size of our vocabulary. And the `output_dim` parameter defines the dimension of the embedding output.\n",
        "\n",
        "Once created, our embedding layer will take the integer outputs of our `text_vectorizer` layer as inputs and convert them to feature vectors of size `output_dim`.\n",
        "\n",
        "Let's see it in action."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "AIKPM2QOuLQv",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "4be7bb07-5da3-43a9-a2d4-119def0f5d5c"
      },
      "source": [
        "# Create token embedding layer\n",
        "token_embed = layers.Embedding(input_dim=len(rct_20k_text_vocab), # length of vocabulary\n",
        "                               output_dim=128, # Note: different embedding sizes result in drastically different numbers of parameters to train\n",
        "                               # Use masking to handle variable sequence lengths (save space)\n",
        "                               mask_zero=True,\n",
        "                               name=\"token_embedding\") \n",
        "\n",
        "# Show example embedding\n",
        "print(f\"Sentence before vectorization:\\n{target_sentence}\\n\")\n",
        "vectorized_sentence = text_vectorizer([target_sentence])\n",
        "print(f\"Sentence after vectorization (before embedding):\\n{vectorized_sentence}\\n\")\n",
        "embedded_sentence = token_embed(vectorized_sentence)\n",
        "print(f\"Sentence after embedding:\\n{embedded_sentence}\\n\")\n",
        "print(f\"Embedded sentence shape: {embedded_sentence.shape}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Sentence before vectorization:\n",
            "by randomisation , patients were allocated to usual hospital care or hospital-at-home , which included discharge at day @ of admission , followed by home treatment with homes visits by community nurses until day @ of treatment .\n",
            "\n",
            "Sentence after vectorization (before embedding):\n",
            "[[   22   934    12     9   379     6   370   237    77    16 18633   126\n",
            "    121   696    15   108     4  1041   284    22   548    19     7  2824\n",
            "    620    22   613  1583   634   108     4    19     0     0     0     0\n",
            "      0     0     0     0     0     0     0     0     0     0     0     0\n",
            "      0     0     0     0     0     0     0]]\n",
            "\n",
            "Sentence after embedding:\n",
            "[[[ 0.01310552 -0.0399843  -0.02923425 ... -0.00507909  0.03536068\n",
            "   -0.0158527 ]\n",
            "  [-0.04288385  0.0036533   0.04550033 ...  0.04761951 -0.01333101\n",
            "    0.02340751]\n",
            "  [ 0.00665864 -0.03900324  0.0316339  ...  0.0469101   0.02812362\n",
            "   -0.0136387 ]\n",
            "  ...\n",
            "  [ 0.03553898 -0.04906787 -0.0030926  ...  0.00124459  0.00030689\n",
            "   -0.03466149]\n",
            "  [ 0.03553898 -0.04906787 -0.0030926  ...  0.00124459  0.00030689\n",
            "   -0.03466149]\n",
            "  [ 0.03553898 -0.04906787 -0.0030926  ...  0.00124459  0.00030689\n",
            "   -0.03466149]]]\n",
            "\n",
            "Embedded sentence shape: (1, 55, 128)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "l5tDy1PRfvZ0"
      },
      "source": [
        "## Create datasets (as fast as possible)\n",
        "\n",
        "We've gone through all the trouble of preprocessing our datasets to be used with a machine learning model, however, there are still a few steps we can use to make them work faster with our models.\n",
        "\n",
        "Namely, the `tf.data` API provides methods which enable faster data loading.\n",
        "\n",
        "> 📖 **Resource:** For best practices on data loading in TensorFlow, check out the following:\n",
        "* [tf.data: Build TensorFlow input pipelines](https://www.tensorflow.org/guide/data)\n",
        "* [Better performance with the tf.data API](https://www.tensorflow.org/guide/data_performance)\n",
        "\n",
        "The main steps we'll want to use with our data is to turn it into a `PrefetchDataset` of batches.\n",
        "\n",
        "Doing so we'll ensure TensorFlow loads our data onto the GPU as fast as possible, in turn leading to faster training time.\n",
        "\n",
        "To create a batched `PrefetchDataset` we can use the methods [`batch()`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#batch) and [`prefetch()`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#prefetch), the parameter [`tf.data.AUTOTUNE`](https://www.tensorflow.org/api_docs/python/tf/data#AUTOTUNE) will also allow TensorFlow to determine the optimal amount of compute to use to prepare datasets."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tan6Ekiwfza5",
        "outputId": "6b35345d-bfce-4119-ae30-24f6d29846e0"
      },
      "source": [
        "# Turn our data into TensorFlow Datasets\n",
        "train_dataset = tf.data.Dataset.from_tensor_slices((train_sentences, train_labels_one_hot))\n",
        "valid_dataset = tf.data.Dataset.from_tensor_slices((val_sentences, val_labels_one_hot))\n",
        "test_dataset = tf.data.Dataset.from_tensor_slices((test_sentences, test_labels_one_hot))\n",
        "\n",
        "train_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<TensorSliceDataset shapes: ((), (5,)), types: (tf.string, tf.float64)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 53
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "dnEJakTxgJWx",
        "outputId": "272cb50e-cff3-4093-b689-f2b5811c070d"
      },
      "source": [
        "# Take the TensorSliceDataset's and turn them into prefetched batches\n",
        "train_dataset = train_dataset.batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "valid_dataset = valid_dataset.batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "test_dataset = test_dataset.batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "\n",
        "train_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<PrefetchDataset shapes: ((None,), (None, 5)), types: (tf.string, tf.float64)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 54
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HeE3wo4QvOlR"
      },
      "source": [
        "## Model 1: Conv1D with token embeddings\n",
        "\n",
        "Alright, we've now got a way to numerically represent our text and labels, time to build a series of deep models to try and improve upon our baseline.\n",
        "\n",
        "All of our deep models will follow a similar structure:\n",
        "\n",
        "```\n",
        "Input (text) -> Tokenize -> Embedding -> Layers -> Output (label probability)\n",
        "```\n",
        "\n",
        "The main component we'll be changing throughout is the `Layers` component. Because any modern deep NLP model requires text to be converted into an embedding before meaningful patterns can be discovered within.\n",
        "\n",
        "The first model we're going to build is a 1-dimensional Convolutional Neural Network. \n",
        "\n",
        "We're also going to be following the standard machine learning workflow of:\n",
        "- Build model\n",
        "- Train model\n",
        "- Evaluate model (make predictions and compare to ground truth)\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oTW5buTKvRR6"
      },
      "source": [
        "# Create 1D convolutional model to process sequences\n",
        "inputs = layers.Input(shape=(1,), dtype=tf.string)\n",
        "text_vectors = text_vectorizer(inputs) # vectorize text inputs\n",
        "token_embeddings = token_embed(text_vectors) # create embedding\n",
        "x = layers.Conv1D(64, kernel_size=5, padding=\"same\", activation=\"relu\")(token_embeddings)\n",
        "x = layers.GlobalAveragePooling1D()(x) # condense the output of our feature vector\n",
        "outputs = layers.Dense(num_classes, activation=\"softmax\")(x)\n",
        "model_1 = tf.keras.Model(inputs, outputs)\n",
        "\n",
        "# Compile\n",
        "model_1.compile(loss=\"categorical_crossentropy\", # if your labels are integer form (not one hot) use sparse_categorical_crossentropy\n",
        "                optimizer=tf.keras.optimizers.Adam(),\n",
        "                metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "aOaXSsZjnKmy",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "367daee0-1e45-4813-f4b6-34c6f32cbc95"
      },
      "source": [
        "# Get summary of Conv1D model\n",
        "model_1.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model\"\n",
            "_________________________________________________________________\n",
            "Layer (type)                 Output Shape              Param #   \n",
            "=================================================================\n",
            "input_1 (InputLayer)         [(None, 1)]               0         \n",
            "_________________________________________________________________\n",
            "text_vectorization (TextVect (None, 55)                0         \n",
            "_________________________________________________________________\n",
            "token_embedding (Embedding)  (None, 55, 128)           8299648   \n",
            "_________________________________________________________________\n",
            "conv1d (Conv1D)              (None, 55, 64)            41024     \n",
            "_________________________________________________________________\n",
            "global_average_pooling1d (Gl (None, 64)                0         \n",
            "_________________________________________________________________\n",
            "dense (Dense)                (None, 5)                 325       \n",
            "=================================================================\n",
            "Total params: 8,340,997\n",
            "Trainable params: 8,340,997\n",
            "Non-trainable params: 0\n",
            "_________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-gZdAVJJ3vc2"
      },
      "source": [
        "Wonderful! We've got our first deep sequence model built and ready to go. \n",
        "\n",
        "Checking out the model summary, you'll notice the majority of the trainable parameters are within the embedding layer. If we were to increase the size of the embedding (by increasing the `output_dim` parameter of the `Embedding` layer), the number of trainable parameters would increase dramatically.\n",
        "\n",
        "It's time to fit our model to the training data but we're going to make a mindful change.\n",
        "\n",
        "Since our training data contains nearly 200,000 sentences, fitting a deep model may take a while even with a GPU. So to keep our experiments swift, we're going to run them on a subset of the training dataset.\n",
        "\n",
        "More specifically, we'll only use the first 10% of batches (about 18,000 samples) of the training set to train on and the first 10% of batches from the validation set to validate on.\n",
        "\n",
        "> 🔑 **Note:** It's a standard practice in machine learning to test your models on smaller subsets of data first to make sure they work before scaling them to larger amounts of data. You should aim to run many smaller experiments rather than only a handful of large experiments. And since your time is limited, one of the best ways to run smaller experiments is to reduce the amount of data you're working with (10% of the full dataset is usually a good amount, as long as it covers a similar distribution)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "IKpHoDysgvdC",
        "outputId": "69e96467-3ede-47a3-cf7e-6c2041b1db01"
      },
      "source": [
        "# Fit the model\n",
        "model_1_history = model_1.fit(train_dataset,\n",
        "                              steps_per_epoch=int(0.1 * len(train_dataset)), # only fit on 10% of batches for faster training time\n",
        "                              epochs=3,\n",
        "                              validation_data=valid_dataset,\n",
        "                              validation_steps=int(0.1 * len(valid_dataset))) # only validate on 10% of batches"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/3\n",
            "562/562 [==============================] - 78s 80ms/step - loss: 1.1791 - accuracy: 0.5223 - val_loss: 0.6836 - val_accuracy: 0.7384\n",
            "Epoch 2/3\n",
            "562/562 [==============================] - 44s 79ms/step - loss: 0.6773 - accuracy: 0.7474 - val_loss: 0.6335 - val_accuracy: 0.7683\n",
            "Epoch 3/3\n",
            "562/562 [==============================] - 44s 79ms/step - loss: 0.6181 - accuracy: 0.7727 - val_loss: 0.5915 - val_accuracy: 0.7846\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RQrRp3ar8GQV"
      },
      "source": [
        "Brilliant! We've got our first trained deep sequence model, and it didn't take too long (and if we didn't prefetch our batched data, it would've taken longer).\n",
        "\n",
        "Time to make some predictions with our model and then evaluate them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "WYvFOIBvhjpX",
        "outputId": "7deb88b0-a3fb-4ff3-86c2-77de0230e5cd"
      },
      "source": [
        "# Evaluate on whole validation dataset (we only validated on 10% of batches during training)\n",
        "model_1.evaluate(valid_dataset)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 3s 3ms/step - loss: 0.5935 - accuracy: 0.7872\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[0.5934624075889587, 0.7871706485748291]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 58
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "jAAtBWO2iRft",
        "outputId": "19e21e20-7c1f-450c-c4d8-e6d3a30864ac"
      },
      "source": [
        "# Make predictions (our model outputs prediction probabilities for each class)\n",
        "model_1_pred_probs = model_1.predict(valid_dataset)\n",
        "model_1_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[4.67370778e-01, 1.53675690e-01, 9.07374471e-02, 2.61693090e-01,\n",
              "        2.65230015e-02],\n",
              "       [4.39375877e-01, 2.51222134e-01, 1.34108905e-02, 2.87128955e-01,\n",
              "        8.86206981e-03],\n",
              "       [1.41407549e-01, 7.71213975e-03, 1.85112190e-03, 8.48973274e-01,\n",
              "        5.58596548e-05],\n",
              "       ...,\n",
              "       [3.86551710e-06, 5.48208598e-04, 7.84297357e-04, 2.40832605e-06,\n",
              "        9.98661160e-01],\n",
              "       [6.00541234e-02, 4.66211498e-01, 1.18867174e-01, 7.14003071e-02,\n",
              "        2.83466935e-01],\n",
              "       [1.38142675e-01, 7.17261851e-01, 4.50793169e-02, 3.48987989e-02,\n",
              "        6.46174029e-02]], dtype=float32)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 59
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9ydUpF6cqMll",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "00be2a91-59e1-4f4e-b665-d0d3b675b3c1"
      },
      "source": [
        "# Convert pred probs to classes\n",
        "model_1_preds = tf.argmax(model_1_pred_probs, axis=1)\n",
        "model_1_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(30212,), dtype=int64, numpy=array([0, 0, 3, ..., 4, 1, 1])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 60
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "KMfRLv0omdY4",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "2004c746-a6c5-4e80-b485-d5f04714b244"
      },
      "source": [
        "# Calculate model_1 results\n",
        "model_1_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                    y_pred=model_1_preds)\n",
        "model_1_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 78.71706606646366,\n",
              " 'f1': 0.7846799283214887,\n",
              " 'precision': 0.783721647805203,\n",
              " 'recall': 0.7871706606646366}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 61
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qU1u4KlWvAQa"
      },
      "source": [
        "## Model 2: Feature extraction with pretrained token embeddings\n",
        "\n",
        "Training our own embeddings took a little while to run, slowing our experiments down.\n",
        "\n",
        "Since we're moving towards replicating the model architecture in [*Neural Networks for Joint Sentence Classification\n",
        "in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf), it mentions they used a [pretrained GloVe embedding](https://nlp.stanford.edu/projects/glove/) as a way to initialise their token embeddings.\n",
        "\n",
        "To emulate this, let's see what results we can get with the [pretrained Universal Sentence Encoder embeddings from TensorFlow Hub](https://tfhub.dev/google/universal-sentence-encoder/4).\n",
        "\n",
        "> 🔑 **Note:** We could use GloVe embeddings as per the paper but since we're working with TensorFlow, we'll use what's available from TensorFlow Hub (GloVe embeddings aren't). We'll save [using pretrained GloVe embeddings](https://keras.io/examples/nlp/pretrained_word_embeddings/) as an extension.\n",
        "\n",
        "The model structure will look like:\n",
        "\n",
        "```\n",
        "Inputs (string) -> Pretrained embeddings from TensorFlow Hub (Universal Sentence Encoder) -> Layers -> Output (prediction probabilities)\n",
        "```\n",
        "\n",
        "You'll notice the lack of tokenization layer we've used in a previous model. This is because the Universal Sentence Encoder (USE) takes care of tokenization for us.\n",
        "\n",
        "This type of model is called transfer learning, or more specifically, **feature extraction transfer learning**. In other words, taking the patterns a model has learned elsewhere and applying it to our own problem.\n",
        "\n",
        "![TensorFlow Hub Universal Feature Encoder feature extractor model we're building](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/09-model-tf-hub-USE-to-dense-layer.png)\n",
        "*The feature extractor model we're building using a pretrained embedding from TensorFlow Hub.*\n",
        "\n",
        "To download the pretrained USE into a layer we can use in our model, we can use the [`hub.KerasLayer`](https://www.tensorflow.org/hub/api_docs/python/hub/KerasLayer) class.\n",
        "\n",
        "We'll keep the pretrained embeddings frozen (by setting `trainable=False`) and add a trainable couple of layers on the top to tailor the model outputs to our own data.\n",
        "\n",
        "> 🔑 **Note:** Due to having to download a relatively large model (~916MB), the cell below may take a little while to run."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "hk8mJUNy0xOO"
      },
      "source": [
        "# Download pretrained TensorFlow Hub USE\n",
        "import tensorflow_hub as hub\n",
        "tf_hub_embedding_layer = hub.KerasLayer(\"https://tfhub.dev/google/universal-sentence-encoder/4\",\n",
        "                                        trainable=False,\n",
        "                                        name=\"universal_sentence_encoder\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qOv1JQh1JdW0"
      },
      "source": [
        "Beautiful, now our pretrained USE is downloaded and instantiated as a `hub.KerasLayer` instance, let's test it out on a random sentence."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "f5gCkZgYJYSi",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "f1c50f5f-6c94-4f16-c584-256e8379b4fb"
      },
      "source": [
        "# Test out the embedding on a random sentence\n",
        "random_training_sentence = random.choice(train_sentences)\n",
        "print(f\"Random training sentence:\\n{random_training_sentence}\\n\")\n",
        "use_embedded_sentence = tf_hub_embedding_layer([random_training_sentence])\n",
        "print(f\"Sentence after embedding:\\n{use_embedded_sentence[0][:30]} (truncated output)...\\n\")\n",
        "print(f\"Length of sentence embedding:\\n{len(use_embedded_sentence[0])}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Random training sentence:\n",
            "we did a multicentre , randomised , double-blind , phase @ trial in which men with at least one bone metastasis from castration-resistant prostate cancer that had progressed after docetaxel treatment were randomly assigned in a @:@ ratio to receive bone-directed radiotherapy ( @ gy in one fraction ) followed by either ipilimumab @ mg/kg or placebo every @ weeks for up to four doses .\n",
            "\n",
            "Sentence after embedding:\n",
            "[-0.06267194 -0.02363393 -0.02826927 -0.06744404 -0.01379427 -0.06991897\n",
            " -0.00414775 -0.0494267   0.03362229  0.0004936   0.08354022 -0.05528864\n",
            "  0.05046206 -0.01984152  0.01494281 -0.01103383 -0.08360168 -0.04366067\n",
            "  0.00732924  0.05037152 -0.00068059 -0.00691372 -0.05995404  0.00651074\n",
            "  0.0075112   0.06902377 -0.00548509 -0.07498414 -0.04433047  0.07259601] (truncated output)...\n",
            "\n",
            "Length of sentence embedding:\n",
            "512\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rB98xmH4KO-0"
      },
      "source": [
        "Nice! As we mentioned before the pretrained USE module from TensorFlow Hub takes care of tokenizing our text for us and outputs a 512 dimensional embedding vector.\n",
        "\n",
        "Let's put together and compile a model using our `tf_hub_embedding_layer`."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uJue6QIthOZD"
      },
      "source": [
        "### Building and fitting an NLP feature extraction model from TensorFlow Hub"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "So4lSnW_2F1i"
      },
      "source": [
        "# Define feature extractor model using TF Hub layer\n",
        "inputs = layers.Input(shape=[], dtype=tf.string)\n",
        "pretrained_embedding = tf_hub_embedding_layer(inputs) # tokenize text and create embedding\n",
        "x = layers.Dense(128, activation=\"relu\")(pretrained_embedding) # add a fully connected layer on top of the embedding\n",
        "# Note: you could add more layers here if you wanted to\n",
        "outputs = layers.Dense(5, activation=\"softmax\")(x) # create the output layer\n",
        "model_2 = tf.keras.Model(inputs=inputs,\n",
        "                        outputs=outputs)\n",
        "\n",
        "# Compile the model\n",
        "model_2.compile(loss=\"categorical_crossentropy\",\n",
        "                optimizer=tf.keras.optimizers.Adam(),\n",
        "                metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "39r3jhefoKWG",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "1f49faab-153d-4c93-f43d-cef285d01a14"
      },
      "source": [
        "# Get a summary of the model\n",
        "model_2.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model_1\"\n",
            "_________________________________________________________________\n",
            "Layer (type)                 Output Shape              Param #   \n",
            "=================================================================\n",
            "input_2 (InputLayer)         [(None,)]                 0         \n",
            "_________________________________________________________________\n",
            "universal_sentence_encoder ( (None, 512)               256797824 \n",
            "_________________________________________________________________\n",
            "dense_1 (Dense)              (None, 128)               65664     \n",
            "_________________________________________________________________\n",
            "dense_2 (Dense)              (None, 5)                 645       \n",
            "=================================================================\n",
            "Total params: 256,864,133\n",
            "Trainable params: 66,309\n",
            "Non-trainable params: 256,797,824\n",
            "_________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5Exs-vDmLIs6"
      },
      "source": [
        "Checking the summary of our model we can see there's a large number of total parameters, however, the majority of these are non-trainable. This is because we set `training=False` when we instatiated our USE feature extractor layer.\n",
        "\n",
        "So when we train our model, only the top two output layers will be trained."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ttJKg6cDihGd",
        "outputId": "ebb93e16-6247-4b25-9d1f-a33fc7c85c67"
      },
      "source": [
        "# Fit feature extractor model for 3 epochs\n",
        "model_2.fit(train_dataset,\n",
        "            steps_per_epoch=int(0.1 * len(train_dataset)),\n",
        "            epochs=3,\n",
        "            validation_data=valid_dataset,\n",
        "            validation_steps=int(0.1 * len(valid_dataset)))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/3\n",
            "562/562 [==============================] - 8s 11ms/step - loss: 1.0892 - accuracy: 0.5935 - val_loss: 0.7980 - val_accuracy: 0.6912\n",
            "Epoch 2/3\n",
            "562/562 [==============================] - 6s 11ms/step - loss: 0.7726 - accuracy: 0.7026 - val_loss: 0.7562 - val_accuracy: 0.6995\n",
            "Epoch 3/3\n",
            "562/562 [==============================] - 6s 11ms/step - loss: 0.7572 - accuracy: 0.7132 - val_loss: 0.7388 - val_accuracy: 0.7148\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tensorflow.python.keras.callbacks.History at 0x7fe874696d10>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 70
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "tz8TMzLrjJYm",
        "outputId": "aa6ade61-5ef6-4a21-e78e-4e6bab3d9bca"
      },
      "source": [
        "# Evaluate on whole validation dataset\n",
        "model_2.evaluate(valid_dataset)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 9s 9ms/step - loss: 0.7410 - accuracy: 0.7140\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[0.7409555315971375, 0.713954746723175]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 72
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YmLdj-1tLk3X"
      },
      "source": [
        "Since we aren't training our own custom embedding layer, training is much quicker.\n",
        "\n",
        "Let's make some predictions and evaluate our feature extraction model."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "2oe5UxcgqvA2",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "d65fec78-c9e7-4c2b-b26e-380f110c6a2a"
      },
      "source": [
        "# Make predictions with feature extraction model\n",
        "model_2_pred_probs = model_2.predict(valid_dataset)\n",
        "model_2_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[0.4062823 , 0.38889027, 0.00269767, 0.19336382, 0.00876602],\n",
              "       [0.33066782, 0.50565475, 0.00388181, 0.15708618, 0.00270937],\n",
              "       [0.24373452, 0.13988586, 0.01533568, 0.5568524 , 0.04419148],\n",
              "       ...,\n",
              "       [0.00216129, 0.00626071, 0.05524343, 0.00110265, 0.935232  ],\n",
              "       [0.00387926, 0.04658042, 0.20013681, 0.00145556, 0.74794793],\n",
              "       [0.17956498, 0.27563578, 0.46396002, 0.00653344, 0.07430573]],\n",
              "      dtype=float32)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 71
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "u8RIEnvVq7Ri",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "e414114c-2609-46cf-9eb1-2972af0e61d9"
      },
      "source": [
        "# Convert the predictions with feature extraction model to classes\n",
        "model_2_preds = tf.argmax(model_2_pred_probs, axis=1)\n",
        "model_2_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(30212,), dtype=int64, numpy=array([0, 1, 3, ..., 4, 4, 2])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 73
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "hD5yvw9brOCp",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "17973c06-0a04-491c-c891-78c15d78ea1b"
      },
      "source": [
        "# Calculate results of the TF Hub pretrained embeddings model on the validation set\n",
        "model_2_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                    y_pred=model_2_preds)\n",
        "model_2_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 71.39547199788163,\n",
              " 'f1': 0.7108978902117497,\n",
              " 'precision': 0.7143200501867532,\n",
              " 'recall': 0.7139547199788163}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 74
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EL6wApSH0ltW"
      },
      "source": [
        "## Model 3: Conv1D with character embeddings\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-q-BYLq6d1me"
      },
      "source": [
        "### Creating a character-level tokenizer\n",
        "\n",
        "The [*Neural Networks for Joint Sentence Classification in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf) paper mentions their model uses a hybrid of token and character embeddings.\n",
        "\n",
        "We've built models with a custom token embedding and a pretrained token embedding. How about we build one using a character embedding?\n",
        "\n",
        "The difference between a character and token embedding is that the **character embedding** is created using sequences split into characters (e.g. `hello` -> [`h`, `e`, `l`, `l`, `o`]) whereas a **token embedding** is created on sequences split into tokens.\n",
        "\n",
        "![example of difference between token level and character level embeddings](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/09-token-vs-character-embeddings.png)\n",
        "*Token-level embeddings split sequences into tokens (words) and embed each of them; character-level embeddings split sequences into characters and create a feature vector for each.*\n",
        "\n",
        "We can create a character-level embedding by first vectorizing our sequences (after they've been split into characters) using the [`TextVectorization`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization) class and then passing those vectorized sequences through an [`Embedding`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding) layer.\n",
        "\n",
        "Before we can vectorize our sequences at the character level, we'll need to split them into characters. Let's write a function to do so."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "nkoTYNvu36Bq",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 119
        },
        "outputId": "095b1c15-c958-4f93-9d74-dd98af6aca1b"
      },
      "source": [
        "# Make function to split sentences into characters\n",
        "def split_chars(text):\n",
        "  return \" \".join(list(text))\n",
        "\n",
        "# Test splitting non-character-level sequence into characters\n",
        "split_chars(random_training_sentence)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            },
            "text/plain": [
              "'w e   d i d   a   m u l t i c e n t r e   ,   r a n d o m i s e d   ,   d o u b l e - b l i n d   ,   p h a s e   @   t r i a l   i n   w h i c h   m e n   w i t h   a t   l e a s t   o n e   b o n e   m e t a s t a s i s   f r o m   c a s t r a t i o n - r e s i s t a n t   p r o s t a t e   c a n c e r   t h a t   h a d   p r o g r e s s e d   a f t e r   d o c e t a x e l   t r e a t m e n t   w e r e   r a n d o m l y   a s s i g n e d   i n   a   @ : @   r a t i o   t o   r e c e i v e   b o n e - d i r e c t e d   r a d i o t h e r a p y   (   @   g y   i n   o n e   f r a c t i o n   )   f o l l o w e d   b y   e i t h e r   i p i l i m u m a b   @   m g / k g   o r   p l a c e b o   e v e r y   @   w e e k s   f o r   u p   t o   f o u r   d o s e s   .'"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 75
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NyfYyWOvx2BB"
      },
      "source": [
        "Great! Looks like our character-splitting function works. Let's create character-level datasets by splitting our sequence datasets into characters."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qLmU_GS64S2J",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "2ac92f83-5a90-43f8-e3c8-63ace14c502a"
      },
      "source": [
        "# Split sequence-level data splits into character-level data splits\n",
        "train_chars = [split_chars(sentence) for sentence in train_sentences]\n",
        "val_chars = [split_chars(sentence) for sentence in val_sentences]\n",
        "test_chars = [split_chars(sentence) for sentence in test_sentences]\n",
        "print(train_chars[0])"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "t o   i n v e s t i g a t e   t h e   e f f i c a c y   o f   @   w e e k s   o f   d a i l y   l o w - d o s e   o r a l   p r e d n i s o l o n e   i n   i m p r o v i n g   p a i n   ,   m o b i l i t y   ,   a n d   s y s t e m i c   l o w - g r a d e   i n f l a m m a t i o n   i n   t h e   s h o r t   t e r m   a n d   w h e t h e r   t h e   e f f e c t   w o u l d   b e   s u s t a i n e d   a t   @   w e e k s   i n   o l d e r   a d u l t s   w i t h   m o d e r a t e   t o   s e v e r e   k n e e   o s t e o a r t h r i t i s   (   o a   )   .\n"
          ],
          "name": "stdout"
        }
      ]
    },
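    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see why character-level sequences need their own length check, the sketch below contrasts the two ways of splitting a single sentence. The example sentence is made up for illustration, and `split_chars` is redefined only so the snippet runs on its own.\n",
        "\n",
        "```python\n",
        "# Illustrative sketch: compare token-level and character-level lengths\n",
        "def split_chars(text):\n",
        "  return ' '.join(list(text))\n",
        "\n",
        "sentence = 'to investigate the efficacy'  # hypothetical example sentence\n",
        "tokens = sentence.split()                 # token-level split on whitespace\n",
        "chars = split_chars(sentence).split()     # character-level split (whitespace dropped by .split())\n",
        "\n",
        "print(len(tokens))    # number of tokens\n",
        "print(len(chars))     # number of non-space characters\n",
        "print(len(sentence))  # raw string length, which also counts spaces\n",
        "```\n",
        "\n",
        "Note that the raw string length counts spaces, while splitting on whitespace keeps only the non-space characters, so the two character counts differ slightly."
      ]
    },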
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vkLTb7FkyFPh"
      },
      "source": [
        "To figure out how long our vectorized character sequences should be, let's check the distribution of our character sequence lengths."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4CjyFW5g47Ps",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "1547628c-0750-4e0c-efe8-22f9ea65543a"
      },
      "source": [
        "# What's the average character length?\n",
        "char_lens = [len(sentence) for sentence in train_sentences]\n",
        "mean_char_len = np.mean(char_lens)\n",
        "mean_char_len"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "149.3662574983337"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 77
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uPTgrtVJ2DSK",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 265
        },
        "outputId": "8d3918b4-da72-4c6c-a777-aef090077cc8"
      },
      "source": [
        "# Check the distribution of our sequences at character-level\n",
        "import matplotlib.pyplot as plt\n",
        "plt.hist(char_lens, bins=7);"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAD4CAYAAADy46FuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAWqUlEQVR4nO3df6zddZ3n8edr2wF/zEqLdBimbbZ1bNxUsrNigzVuJsY6paCxbIKmxCzVYW12xV1n1kSLJkNWJYGdyTCSKA4jHYthQZZxlkZhu13EmE0W5CLKT5EroLQBe6UIu2P8Uee9f5zPhWO9/ZTec3vuFZ6P5OR+v+/P53vO+3xz73n1++PepqqQJOlw/sl8NyBJWtgMCklSl0EhSeoyKCRJXQaFJKlr8Xw3MNdOOumkWrVq1Xy3IUm/Ue68884fVdWymcZecEGxatUqJiYm5rsNSfqNkuT7hxvz1JMkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeo6YlAk2ZFkf5J7Zxj7UJJKclJbT5LLk0wmuTvJaUNztyZ5qD22DtVfn+Sets3lSdLqJybZ0+bvSbJ0bt6yJOloPJ8jis8Dmw4tJlkJbAR+MFQ+E1jTHtuAK9rcE4GLgDcApwMXDX3wXwG8b2i76dfaDtxSVWuAW9q6JGnMjvib2VX19SSrZhi6DPgwcONQbTNwdQ3+N6TbkixJcgrwZmBPVR0ASLIH2JTka8Arquq2Vr8aOBu4uT3Xm9vz7gS+BnzkqN7dUVq1/SvH8unn3KOXvG2+W5D0IjCraxRJNgP7qurbhwwtBx4bWt/bar363hnqACdX1eNt+Qng5E4/25JMJJmYmpo62rcjSeo46qBI8jLgo8CfzX07M2tHKIf9P1ur6sqqWldV65Ytm/FvWkmSZmk2RxS/D6wGvp3kUWAF8M0kvwvsA1YOzV3Rar36ihnqAD9sp61oX/fPoldJ0oiOOiiq6p6q+p2qWlVVqxicLjqtqp4AdgHntbuf1gNPt9NHu4GNSZa2i9gbgd1t7Jkk69vdTufx3DWPXcD03VFb+dVrIZKkMXk+t8deC/wf4DVJ9iY5vzP9JuBhYBL4G+D9AO0i9ieAO9rj49MXttucz7VtvsfgQjbAJcAfJXkIeGtblySN2fO56+ncI4yvGlou4ILDzNsB7JihPgGcOkP9SWDDkfqTJB1b/ma2JKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkrqOGBRJdiTZn+TeodqfJ/lOkruT/H2SJUNjFyaZTPJgkjOG6ptabTLJ9qH66iS3t/oXkxzX6se39ck2vmqu3rQk6fl7PkcUnwc2HVLbA5xaVf8C+C5wIUCStcAW4LVtm88kWZRkEfBp4ExgLXBumwtwKXBZVb0aeAo4v9XPB55q9cvaPEnSmB0xKKrq68CBQ2r/s6oOttXbgBVteTNwXVX9rKoeASaB09tjsqoerqqfA9cBm5MEeAtwQ9t+J3D20HPtbMs3ABvafEnSGM3FNYo/Bm5uy8uBx4bG9rba4eqvBH48FDrT9V95rjb+dJv/a5JsSzKRZGJqamrkNyRJes5IQZHkY8BB4Jq5aWd2qurKqlpXVeuWLVs2n61I0gvO4tlumOQ9wNuBDVVVrbwPWDk0bUWrcZj6k8CSJIvbUcPw/Onn2ptkMXBCmy9JGqNZHVEk2QR8GHhHVf1kaGgXsKXdsbQaWAN8A7gDWNPucDqOwQXvXS1gbgXOadtvBW4ceq6tbfkc4KtDgSRJGpMjHlEkuRZ4M3BSkr3ARQzucjoe2NOuL99WVf+uqu5Lcj1wP4NTUhdU1S/b83wA2A0sAnZU1X3tJT4CXJfkk8BdwFWtfhXwhS
STDC6mb5mD9ytJOkpHDIqqOneG8lUz1KbnXwxcPEP9JuCmGeoPM7gr6tD6T4F3Hqk/SdKx5W9mS5K6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXUcMiiQ7kuxPcu9Q7cQke5I81L4ubfUkuTzJZJK7k5w2tM3WNv+hJFuH6q9Pck/b5vIk6b2GJGm8ns8RxeeBTYfUtgO3VNUa4Ja2DnAmsKY9tgFXwOBDH7gIeANwOnDR0Af/FcD7hrbbdITXkCSN0RGDoqq+Dhw4pLwZ2NmWdwJnD9WvroHbgCVJTgHOAPZU1YGqegrYA2xqY6+oqtuqqoCrD3mumV5DkjRGs71GcXJVPd6WnwBObsvLgceG5u1ttV597wz13mv8miTbkkwkmZiamprF25EkHc7IF7PbkUDNQS+zfo2qurKq1lXVumXLlh3LViTpRWe2QfHDdtqI9nV/q+8DVg7NW9FqvfqKGeq915AkjdFsg2IXMH3n0lbgxqH6ee3up/XA0+300W5gY5Kl7SL2RmB3G3smyfp2t9N5hzzXTK8hSRqjxUeakORa4M3ASUn2Mrh76RLg+iTnA98H3tWm3wScBUwCPwHeC1BVB5J8Arijzft4VU1fIH8/gzurXgrc3B50XkOSNEZHDIqqOvcwQxtmmFvABYd5nh3AjhnqE8CpM9SfnOk1JEnj5W9mS5K6DApJUpdBIUnqMigkSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXSMFRZI/TXJfknuTXJvkJUlWJ7k9yWSSLyY5rs09vq1PtvFVQ89zYas/mOSMofqmVptMsn2UXiVJszProEiyHPiPwLqqOhVYBGwBLgUuq6pXA08B57dNzgeeavXL2jySrG3bvRbYBHwmyaIki4BPA2cCa4Fz21xJ0hiNeuppMfDSJIuBlwGPA28BbmjjO4Gz2/Lmtk4b35AkrX5dVf2sqh4BJoHT22Oyqh6uqp8D17W5kqQxmnVQVNU+4C+AHzAIiKeBO4EfV9XBNm0vsLwtLwcea9sebPNfOVw/ZJvD1X9Nkm1JJpJMTE1NzfYtSZJmMMqpp6UM/oW/Gvg94OUMTh2NXVVdWVXrqmrdsmXL5qMFSXrBGuXU01uBR6pqqqp+AXwJeBOwpJ2KAlgB7GvL+4CVAG38BODJ4foh2xyuLkkao1GC4gfA+iQva9caNgD3A7cC57Q5W4Eb2/Kutk4b/2pVVatvaXdFrQbWAN8A7gDWtLuojmNwwXvXCP1KkmZh8ZGnzKyqbk9yA/BN4CBwF3Al8BXguiSfbLWr2iZXAV9IMgkcYPDBT1Xdl+R6BiFzELigqn4JkOQDwG4Gd1TtqKr7ZtuvJGl2Zh0UAFV1EXDRIeWHGdyxdOjcnwLvPMzzXAxcPEP9JuCmUXqUJI3G38yWJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUtdIQZFkSZIbknwnyQNJ3pjkxCR7kjzUvi5tc5Pk8iSTSe5OctrQ82xt8x9KsnWo/vok97RtLk+SUfqVJB29UY8oPgX8j6r658AfAA8A24FbqmoNcEtbBzgTWNMe24ArAJKcCFwEvAE4HbhoOlzanPcNbbdpxH4lSUdp1kGR5ATgD4GrAKrq51X1Y2AzsLNN2wmc3ZY3A1fXwG3AkiSnAGcAe6rqQFU9BewBNrWxV1TVbVVVwNVDzyVJGpNRjihWA1PA3ya5K8nnkrwcOLmqHm9zngBObsvLgceGtt/bar363hnqvybJtiQTSSampq
ZGeEuSpEONEhSLgdOAK6rqdcA/8NxpJgDakUCN8BrPS1VdWVXrqmrdsmXLjvXLSdKLyihBsRfYW1W3t/UbGATHD9tpI9rX/W18H7ByaPsVrdarr5ihLkkao1kHRVU9ATyW5DWttAG4H9gFTN+5tBW4sS3vAs5rdz+tB55up6h2AxuTLG0XsTcCu9vYM0nWt7udzht6LknSmCwecfv/AFyT5DjgYeC9DMLn+iTnA98H3tXm3gScBUwCP2lzqaoDST4B3NHmfbyqDrTl9wOfB14K3NwekqQxGikoqupbwLoZhjbMMLeACw7zPDuAHTPUJ4BTR+lRkjQafzNbktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqGjkokixKcleSL7f11UluTzKZ5ItJjmv149v6ZBtfNfQcF7b6g0nOGKpvarXJJNtH7VWSdPTm4ojig8ADQ+uXApdV1auBp4DzW/184KlWv6zNI8laYAvwWmAT8JkWPouATwNnAmuBc9tcSdIYjRQUSVYAbwM+19YDvAW4oU3ZCZzdlje3ddr4hjZ/M3BdVf2sqh4BJoHT22Oyqh6uqp8D17W5kqQxGvWI4q+ADwP/2NZfCfy4qg629b3A8ra8HHgMoI0/3eY/Wz9km8PVf02SbUkmkkxMTU2N+JYkScNmHRRJ3g7sr6o757CfWamqK6tqXVWtW7Zs2Xy3I0kvKItH2PZNwDuSnAW8BHgF8ClgSZLF7ahhBbCvzd8HrAT2JlkMnAA8OVSfNrzN4eqSpDGZ9RFFVV1YVSuqahWDi9Ffrap3A7cC57RpW4Eb2/Kutk4b/2pVVatvaXdFrQbWAN8A7gDWtLuojmuvsWu2/UqSZmeUI4rD+QhwXZJPAncBV7X6VcAXkkwCBxh88FNV9yW5HrgfOAhcUFW/BEjyAWA3sAjYUVX3HYN+f2Ot2v6V+W7heXv0krfNdwuSZmlOgqKqvgZ8rS0/zOCOpUPn/BR452G2vxi4eIb6TcBNc9GjJGl2/M1sSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1GVQSJK6DApJUpdBIUnqMigkSV0GhSSpa9ZBkWRlkluT3J/kviQfbPUTk+xJ8lD7urTVk+TyJJNJ7k5y2tBzbW3zH0qydaj++iT3tG0uT5JR3qwk6eiNckRxEPhQVa0F1gMXJFkLbAduqao1wC1tHeBMYE17bAOugEGwABcBbwBOBy6aDpc2531D220aoV9J0izMOiiq6vGq+mZb/r/AA8ByYDOws03bCZzdljcDV9fAbcCSJKcAZwB7qupAVT0F7AE2tbFXVNVtVVXA1UPPJUkakzm5RpFkFfA64Hbg5Kp6vA09AZzclpcDjw1ttrfVevW9M9Rnev1tSSaSTExNTY30XiRJv2rkoEjy28DfAX9SVc8Mj7UjgRr1NY6kqq6sqnVVtW7ZsmXH+uUk6UVlpKBI8lsMQuKaqvpSK/+wnTaifd3f6vuAlUObr2i1Xn3FDHVJ0hiNctdTgKuAB6rqL4eGdgHTdy5tBW4cqp/X7n5aDzzdTlHtBjYmWdouYm8EdrexZ5Ksb6913tBzSZLGZPEI274J+DfAPUm+1WofBS4Brk9yPvB94F1t7CbgLGAS+AnwXoCqOpDkE8Adbd7Hq+pAW34/8HngpcDN7SFJGqNZB0VV/W/gcL/XsGGG+QVccJjn2gHsmKE+AZw62x4lSaPzN7MlSV0GhSSpy6CQJHUZFJKkLoNCktRlUEiSugwKSVKXQSFJ6jIoJEldBoUkqcugkCR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXQaFJKnLoJAkdRkUkqQug0KS1LV4vhs4kiSbgE8Bi4DPVd
Ul89ySZmHV9q/MdwtH5dFL3jbfLUgLxoI+okiyCPg0cCawFjg3ydr57UqSXlwWdFAApwOTVfVwVf0cuA7YPM89SdKLykI/9bQceGxofS/whkMnJdkGbGur/y/Jg7N8vZOAH81y2/lgv8dILgV+g/pt7PfYeqH3+88ON7DQg+J5qaorgStHfZ4kE1W1bg5aGgv7Pbbs99iy32NrLvtd6Kee9gErh9ZXtJokaUwWelDcAaxJsjrJccAWYNc89yRJLyoL+tRTVR1M8gFgN4PbY3dU1X3H8CVHPn01ZvZ7bNnvsWW/x9ac9ZuqmqvnkiS9AC30U0+SpHlmUEiSugwKBn8mJMmDSSaTbJ/vfgCSrExya5L7k9yX5IOtfmKSPUkeal+XtnqSXN7ew91JTpunvhcluSvJl9v66iS3t76+2G5KIMnxbX2yja+ah16XJLkhyXeSPJDkjQt5/yb50/a9cG+Sa5O8ZCHt3yQ7kuxPcu9Q7aj3Z5Ktbf5DSbaOud8/b98Pdyf5+yRLhsYubP0+mOSMofpYPj9m6ndo7ENJKslJbX1u929VvagfDC6Sfw94FXAc8G1g7QLo6xTgtLb8T4HvMvgzJv8F2N7q24FL2/JZwM1AgPXA7fPU938C/ivw5bZ+PbClLX8W+Pdt+f3AZ9vyFuCL89DrTuDftuXjgCULdf8y+OXTR4CXDu3X9yyk/Qv8IXAacO9Q7aj2J3Ai8HD7urQtLx1jvxuBxW350qF+17bPhuOB1e0zY9E4Pz9m6rfVVzK44ef7wEnHYv+O9QdzIT6ANwK7h9YvBC6c775m6PNG4I+AB4FTWu0U4MG2/NfAuUPzn503xh5XALcAbwG+3L5JfzT0g/fsvm7f2G9sy4vbvIyx1xPaB28OqS/I/ctzf6XgxLa/vgycsdD2L7DqkA/eo9qfwLnAXw/Vf2Xese73kLF/DVzTln/lc2F6/47782OmfoEbgD8AHuW5oJjT/eupp5n/TMjyeeplRu20weuA24GTq+rxNvQEcHJbXgjv46+ADwP/2NZfCfy4qg7O0NOz/bbxp9v8cVkNTAF/206VfS7Jy1mg+7eq9gF/AfwAeJzB/rqThbt/px3t/lwI38fT/pjBv8phgfabZDOwr6q+fcjQnPZrUCxwSX4b+DvgT6rqmeGxGvyTYEHc35zk7cD+qrpzvnt5nhYzOIy/oqpeB/wDg1Mjz1pg+3cpgz+IuRr4PeDlwKZ5beooLaT9eSRJPgYcBK6Z714OJ8nLgI8Cf3asX8ugWMB/JiTJbzEIiWuq6kut/MMkp7TxU4D9rT7f7+NNwDuSPMrgr/y+hcH/I7IkyfQvdg739Gy/bfwE4Mkx9rsX2FtVt7f1GxgEx0Ldv28FHqmqqar6BfAlBvt8oe7faUe7P+d7P5PkPcDbgXe3cKPT13z2+/sM/uHw7fZztwL4ZpLf7fQ1q34NigX6Z0KSBLgKeKCq/nJoaBcwfafCVgbXLqbr57W7HdYDTw8d8h9zVXVhVa2oqlUM9uFXq+rdwK3AOYfpd/p9nNPmj+1fm1X1BPBYkte00gbgfhbo/mVwyml9kpe1743pfhfk/h1ytPtzN7AxydJ2FLWx1cYig/8o7cPAO6rqJ0NDu4At7W6y1cAa4BvM4+dHVd1TVb9TVavaz91eBjfAPMFc799jddHlN+nB4A6B7zK4e+Fj891P6+lfMThMvxv4VnucxeA88y3AQ8D/Ak5s88PgP3n6HnAPsG4ee38zz9319CoGP1CTwH8Djm/1l7T1yTb+qnno818CE20f/3cGd4Es2P0L/GfgO8C9wBcY3IGzYPYvcC2D6ye/aB9a589mfzK4NjDZHu8dc7+TDM7hT//MfXZo/sdavw8CZw7Vx/L5MVO/h4w/ynMXs+d0//onPCRJXZ56kiR1GRSSpC6DQpLUZVBIkroMCklSl0EhSeoyKCRJXf8fWfBom7qekSwAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pV8yNV6l1hO4"
      },
      "source": [
        "Okay, looks like most of our sequences are between 0 and 200 characters long.\n",
        "\n",
        "Let's use NumPy's percentile to figure out what length covers 95% of our sequences."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "P_k46x0Wy2n9",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "5da2a2b9-2f11-49a2-9200-ed00e4875b96"
      },
      "source": [
        "# Find what character length covers 95% of sequences\n",
        "output_seq_char_len = int(np.percentile(char_lens, 95))\n",
        "output_seq_char_len"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "290"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 79
        }
      ]
    },
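    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough sanity check of what covering a share of sequences means, the sketch below measures the fraction of sequences that fit within a given cutoff. The `coverage` helper and the toy lengths are assumptions made for illustration, not part of the notebook's data.\n",
        "\n",
        "```python\n",
        "# Illustrative sketch: what fraction of sequences fit within a cutoff length?\n",
        "def coverage(lengths, cutoff):\n",
        "  # Sequences no longer than the cutoff are kept whole; longer ones get truncated\n",
        "  return sum(length <= cutoff for length in lengths) / len(lengths)\n",
        "\n",
        "toy_lens = [50, 80, 120, 150, 150, 180, 200, 240, 290, 900]  # hypothetical lengths\n",
        "mean_len = sum(toy_lens) / len(toy_lens)\n",
        "\n",
        "print(coverage(toy_lens, cutoff=mean_len))  # share of sequences covered by the mean\n",
        "print(coverage(toy_lens, cutoff=290))       # share covered by a larger cutoff\n",
        "```\n",
        "\n",
        "Swapping in `char_lens` and `output_seq_char_len` from the cells above should give a value close to 0.95."
      ]
    },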
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4dDBUHMT3QwS"
      },
      "source": [
        "Wonderful! Now that we know the sequence length which covers 95% of sequences, we'll use it as the `output_sequence_length` parameter of our `TextVectorization` layer.\n",
        "\n",
        "> 🔑 **Note:** You can experiment here to figure out what the optimal `output_sequence_length` should be; perhaps using the mean gives results as good as using the 95th percentile.\n",
        "\n",
        "We'll set `max_tokens` (the maximum size of the character vocabulary) to the number of characters in `alphabet` (lowercase letters, digits and punctuation) plus 2 for the padding and OOV (out of vocabulary or unknown) tokens. This is an upper bound: the vocabulary the layer actually learns can be smaller."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        },
        "id": "a7uKkbP_irFg",
        "outputId": "971506e4-bf56-47e0-dce0-532a99137992"
      },
      "source": [
        "# Get all keyboard characters for char-level embedding\n",
        "import string\n",
        "alphabet = string.ascii_lowercase + string.digits + string.punctuation\n",
        "alphabet"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            },
            "text/plain": [
              "'abcdefghijklmnopqrstuvwxyz0123456789!\"#$%&\\'()*+,-./:;<=>?@[\\\\]^_`{|}~'"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 83
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PTMInkbv4Jxi"
      },
      "source": [
        "# Create char-level token vectorizer instance\n",
        "NUM_CHAR_TOKENS = len(alphabet) + 2 # num characters in alphabet + space + OOV token\n",
        "char_vectorizer = TextVectorization(max_tokens=NUM_CHAR_TOKENS,  \n",
        "                                    output_sequence_length=output_seq_char_len,\n",
        "                                    standardize=\"lower_and_strip_punctuation\",\n",
        "                                    name=\"char_vectorizer\")\n",
        "\n",
        "# Adapt character vectorizer to training characters\n",
        "char_vectorizer.adapt(train_chars)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "u5YsVAJ25JKI"
      },
      "source": [
        "Nice! Now we've adapted our `char_vectorizer` to our character-level sequences, let's check out some characteristics about it using the [`get_vocabulary()`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization#get_vocabulary) method."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uxdh7gxv5R4i",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "98ee7b46-3567-49cc-c84c-26cd7dc02e46"
      },
      "source": [
        "# Check character vocabulary characteristics\n",
        "char_vocab = char_vectorizer.get_vocabulary()\n",
        "print(f\"Number of different characters in character vocab: {len(char_vocab)}\")\n",
        "print(f\"5 most common characters: {char_vocab[:5]}\")\n",
        "print(f\"5 least common characters: {char_vocab[-5:]}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Number of different characters in character vocab: 28\n",
            "5 most common characters: ['', '[UNK]', 'e', 't', 'i']\n",
            "5 least common characters: ['k', 'x', 'z', 'q', 'j']\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "sFYO0vav51zl"
      },
      "source": [
        "We can also test it on random sequences of characters to make sure it's working."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "AAcasGEh5d2O",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "11177d2b-b26a-4322-9bb3-c1e7c4f9e694"
      },
      "source": [
        "# Test out character vectorizer\n",
        "random_train_chars = random.choice(train_chars)\n",
        "print(f\"Charified text:\\n{random_train_chars}\")\n",
        "print(f\"\\nLength of chars: {len(random_train_chars.split())}\")\n",
        "vectorized_chars = char_vectorizer([random_train_chars])\n",
        "print(f\"\\nVectorized chars:\\n{vectorized_chars}\")\n",
        "print(f\"\\nLength of vectorized chars: {len(vectorized_chars[0])}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Charified text:\n",
            "t h i s   a r t i c l e   d e s c r i b e s   t h e   r e s e a r c h   m e t h o d s   f o r   t h e   s h a p i n g   h e a l t h y   c h o i c e s   p r o g r a m   ,   a   m o d e l   t o   i m p r o v e   n u t r i t i o n   a n d   h e a l t h - r e l a t e d   k n o w l e d g e   a n d   b e h a v i o r s   a m o n g   s c h o o l - a g e d   c h i l d r e n   .\n",
            "\n",
            "Length of chars: 160\n",
            "\n",
            "Vectorized chars:\n",
            "[[ 3 13  4  9  5  8  3  4 11 12  2 10  2  9 11  8  4 22  2  9  3 13  2  8\n",
            "   2  9  2  5  8 11 13 15  2  3 13  7 10  9 17  7  8  3 13  2  9 13  5 14\n",
            "   4  6 18 13  2  5 12  3 13 19 11 13  7  4 11  2  9 14  8  7 18  8  5 15\n",
            "   5 15  7 10  2 12  3  7  4 15 14  8  7 21  2  6 16  3  8  4  3  4  7  6\n",
            "   5  6 10 13  2  5 12  3 13  8  2 12  5  3  2 10 23  6  7 20 12  2 10 18\n",
            "   2  5  6 10 22  2 13  5 21  4  7  8  9  5 15  7  6 18  9 11 13  7  7 12\n",
            "   5 18  2 10 11 13  4 12 10  8  2  6  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0\n",
            "   0  0]]\n",
            "\n",
            "Length of vectorized chars: 290\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aT_OiBd_6j8W"
      },
      "source": [
        "You'll notice sequences with a length shorter than 290 (`output_seq_char_len`) get padded with zeros on the end. This ensures all sequences passed to our model are the same length.\n",
        "\n",
        "Also, because the `standardize` parameter of `TextVectorization` is `\"lower_and_strip_punctuation\"` and the `split` parameter is `\"whitespace\"` by default, symbols (such as `@`) are stripped and spaces are used only to separate characters, so neither appears in the vectorized output.\n",
        "\n",
        "> 🔑 **Note:** If you didn't want punctuation to be removed (keep the `@`, `%` etc), you can create a custom standardization callable and pass it as the `standardize` parameter. See the [`TextVectorization`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization) class documentation for more.\n"
      ]
    },
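    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make this behaviour concrete, here is a pure-Python approximation of what the vectorizer does to one sequence: lowercase, strip punctuation, split on whitespace, map each character to an index, then pad or truncate. The `standardize` and `vectorize` helpers and `toy_vocab` are hypothetical stand-ins, not the `TextVectorization` implementation.\n",
        "\n",
        "```python\n",
        "import string\n",
        "\n",
        "# Illustrative sketch: approximate the vectorization steps in plain Python\n",
        "def standardize(text):\n",
        "  # Lowercase, then drop ASCII punctuation (approximates 'lower_and_strip_punctuation')\n",
        "  return text.lower().translate(str.maketrans('', '', string.punctuation))\n",
        "\n",
        "def vectorize(text, vocab, seq_len):\n",
        "  tokens = standardize(text).split()           # split on whitespace\n",
        "  ids = [vocab.get(tok, 1) for tok in tokens]  # index 1 stands for the unknown token\n",
        "  ids = ids[:seq_len]                          # truncate long sequences\n",
        "  return ids + [0] * (seq_len - len(ids))      # pad short sequences with index 0\n",
        "\n",
        "toy_vocab = {'e': 2, 't': 3, 'i': 4}  # hypothetical character-to-index mapping\n",
        "print(vectorize('t i e @ , z', toy_vocab, seq_len=8))  # -> [3, 4, 2, 1, 0, 0, 0, 0]\n",
        "```\n",
        "\n",
        "The `@` and `,` disappear during standardization, `z` falls back to the unknown index because it is missing from `toy_vocab`, and the remaining positions are filled with zeros."
      ]
    },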
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m8WEfkrDeNIm"
      },
      "source": [
        "### Creating a character-level embedding\n",
        "We've got a way to vectorize our character-level sequences, now it's time to create a character-level embedding.\n",
        "\n",
        "Just like our custom token embedding, we can do so using the [`tensorflow.keras.layers.Embedding`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding) class.\n",
        "\n",
        "Our character-level embedding layer requires an input dimension and output dimension.\n",
        "\n",
        "The input dimension (`input_dim`) will be `NUM_CHAR_TOKENS` (70), the maximum vocabulary size we passed to `char_vectorizer`, which leaves room for every index in our `char_vocab` (28 entries). And since we're following the structure of the model in Figure 1 of [*Neural Networks for Joint Sentence Classification in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf), the output dimension of the character embedding (`output_dim`) will be 25."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "YQHt1hSy57cu",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "f2599811-aeb4-48a9-c139-6db8779e3682"
      },
      "source": [
        "# Create char embedding layer\n",
        "char_embed = layers.Embedding(input_dim=NUM_CHAR_TOKENS, # number of different characters\n",
        "                              output_dim=25, # embedding dimension of each character (same as Figure 1 in https://arxiv.org/pdf/1612.05251.pdf)\n",
        "                              mask_zero=True,\n",
        "                              name=\"char_embed\")\n",
        "\n",
        "# Test out character embedding layer\n",
        "print(f\"Charified text (before vectorization and embedding):\\n{random_train_chars}\\n\")\n",
        "char_embed_example = char_embed(char_vectorizer([random_train_chars]))\n",
        "print(f\"Embedded chars (after vectorization and embedding):\\n{char_embed_example}\\n\")\n",
        "print(f\"Character embedding shape: {char_embed_example.shape}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Charified text (before vectorization and embedding):\n",
            "t h i s   a r t i c l e   d e s c r i b e s   t h e   r e s e a r c h   m e t h o d s   f o r   t h e   s h a p i n g   h e a l t h y   c h o i c e s   p r o g r a m   ,   a   m o d e l   t o   i m p r o v e   n u t r i t i o n   a n d   h e a l t h - r e l a t e d   k n o w l e d g e   a n d   b e h a v i o r s   a m o n g   s c h o o l - a g e d   c h i l d r e n   .\n",
            "\n",
            "Embedded chars (after vectorization and embedding):\n",
            "[[[-0.04437004  0.04794488 -0.04102694 ... -0.00454418  0.03287859\n",
            "    0.03674144]\n",
            "  [ 0.0106876   0.0073402   0.0420274  ...  0.02661455 -0.00435288\n",
            "    0.02007648]\n",
            "  [ 0.04274854 -0.01991389 -0.00523685 ...  0.03240981  0.02791805\n",
            "    0.00720155]\n",
            "  ...\n",
            "  [-0.00077192  0.0400042   0.03765562 ... -0.00525742  0.03786433\n",
            "   -0.02974678]\n",
            "  [-0.00077192  0.0400042   0.03765562 ... -0.00525742  0.03786433\n",
            "   -0.02974678]\n",
            "  [-0.00077192  0.0400042   0.03765562 ... -0.00525742  0.03786433\n",
            "   -0.02974678]]]\n",
            "\n",
            "Character embedding shape: (1, 290, 25)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bXuuUjDHPG-J"
      },
      "source": [
        "Wonderful! Each character in our sequences gets turned into a 25-dimensional embedding."
      ]
    },
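    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Conceptually, an embedding layer is a lookup table with one row per vocabulary index. The sketch below imitates that with plain Python lists; the random initial values and the `embed` helper are assumptions for illustration, and a real `Embedding` layer learns its rows during training.\n",
        "\n",
        "```python\n",
        "import random\n",
        "\n",
        "# Illustrative sketch: an embedding as a lookup table of shape (input_dim, output_dim)\n",
        "random.seed(42)\n",
        "input_dim, output_dim = 70, 25\n",
        "table = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]\n",
        "         for _ in range(input_dim)]\n",
        "\n",
        "def embed(token_ids):\n",
        "  # Each integer index selects one row (vector) from the table\n",
        "  return [table[i] for i in token_ids]\n",
        "\n",
        "embedded = embed([3, 13, 4, 9, 0, 0])   # character indices followed by padding zeros\n",
        "print(len(embedded), len(embedded[0]))  # sequence length, embedding dimension\n",
        "print(embedded[4] == embedded[5])       # identical indices share the same vector\n",
        "```\n",
        "\n",
        "This is why the trailing rows of the embedded example are identical: they all correspond to the padding index 0."
      ]
    },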
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1bzv_FmFd9bN"
      },
      "source": [
        "### Building a Conv1D model to fit on character embeddings\n",
        "Now that we've got a way to turn our character-level sequences into numbers (`char_vectorizer`) and to numerically represent them as an embedding (`char_embed`), let's test how effective they are at encoding the information in our sequences by creating a character-level sequence model.\n",
        "\n",
        "The model will have the same structure as our custom token embedding model (`model_1`) except it'll take character-level sequences as input instead of token-level sequences.\n",
        "\n",
        "```\n",
        "Input (character-level text) -> Tokenize -> Embedding -> Layers (Conv1D, GlobalMaxPool1D) -> Output (label probability)\n",
        "```\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "vVwC0xadtb5r"
      },
      "source": [
        "# Make Conv1D on chars only\n",
        "inputs = layers.Input(shape=(1,), dtype=\"string\")\n",
        "char_vectors = char_vectorizer(inputs)\n",
        "char_embeddings = char_embed(char_vectors)\n",
        "x = layers.Conv1D(64, kernel_size=5, padding=\"same\", activation=\"relu\")(char_embeddings)\n",
        "x = layers.GlobalMaxPool1D()(x)\n",
        "outputs = layers.Dense(num_classes, activation=\"softmax\")(x)\n",
        "model_3 = tf.keras.Model(inputs=inputs,\n",
        "                         outputs=outputs,\n",
        "                         name=\"model_3_conv1D_char_embedding\")\n",
        "\n",
        "# Compile model\n",
        "model_3.compile(loss=\"categorical_crossentropy\",\n",
        "                optimizer=tf.keras.optimizers.Adam(),\n",
        "                metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jwdxy2gQu7Wm",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "577e7f06-53e6-4f00-c3bc-b777efd3643e"
      },
      "source": [
        "# Check the summary of model_3 (Conv1D on character embeddings)\n",
        "model_3.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model_3_conv1D_char_embedding\"\n",
            "_________________________________________________________________\n",
            "Layer (type)                 Output Shape              Param #   \n",
            "=================================================================\n",
            "input_3 (InputLayer)         [(None, 1)]               0         \n",
            "_________________________________________________________________\n",
            "char_vectorizer (TextVectori (None, 290)               0         \n",
            "_________________________________________________________________\n",
            "char_embed (Embedding)       (None, 290, 25)           1750      \n",
            "_________________________________________________________________\n",
            "conv1d_1 (Conv1D)            (None, 290, 64)           8064      \n",
            "_________________________________________________________________\n",
            "global_max_pooling1d (Global (None, 64)                0         \n",
            "_________________________________________________________________\n",
            "dense_3 (Dense)              (None, 5)                 325       \n",
            "=================================================================\n",
            "Total params: 10,139\n",
            "Trainable params: 10,139\n",
            "Non-trainable params: 0\n",
            "_________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Sr9rNkxAkURZ"
      },
      "source": [
        "Before fitting our model on the data, we'll create char-level batched `PrefetchedDataset`'s."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ixTsGYBbnXn9",
        "outputId": "28b594e5-0511-4867-8b2a-4fb41b965fba"
      },
      "source": [
        "# Create char datasets\n",
        "train_char_dataset = tf.data.Dataset.from_tensor_slices((train_chars, train_labels_one_hot)).batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "val_char_dataset = tf.data.Dataset.from_tensor_slices((val_chars, val_labels_one_hot)).batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "\n",
        "train_char_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<PrefetchDataset shapes: ((None,), (None, 5)), types: (tf.string, tf.float64)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 91
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8qpv1NR_cC1h"
      },
      "source": [
        "Just like our token-level sequence model, to save time with our experiments, we'll fit the character-level model on 10% of batches."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "UGokmMdGn91w",
        "outputId": "368a0b3b-5753-4b31-d5ec-a38a554a3a39"
      },
      "source": [
        "# Fit the model on chars only\n",
        "model_3_history = model_3.fit(train_char_dataset,\n",
        "                              steps_per_epoch=int(0.1 * len(train_char_dataset)),\n",
        "                              epochs=3,\n",
        "                              validation_data=val_char_dataset,\n",
        "                              validation_steps=int(0.1 * len(val_char_dataset)))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/3\n",
            "562/562 [==============================] - 4s 7ms/step - loss: 1.3984 - accuracy: 0.4280 - val_loss: 1.0557 - val_accuracy: 0.5808\n",
            "Epoch 2/3\n",
            "562/562 [==============================] - 4s 6ms/step - loss: 1.0472 - accuracy: 0.5793 - val_loss: 0.9416 - val_accuracy: 0.6277\n",
            "Epoch 3/3\n",
            "562/562 [==============================] - 3s 6ms/step - loss: 0.9438 - accuracy: 0.6305 - val_loss: 0.8741 - val_accuracy: 0.6669\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "9OHO-fl9oA5V",
        "outputId": "02611358-32e2-4165-8edb-01ed03c6496e"
      },
      "source": [
        "# Evaluate model_3 on whole validation char dataset\n",
        "model_3.evaluate(val_char_dataset)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 4s 4ms/step - loss: 0.8924 - accuracy: 0.6514\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[0.8923829197883606, 0.6514298915863037]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 93
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "8sMIB_nXJd-M"
      },
      "source": [
        "Nice! Looks like our character-level model is working, let's make some predictions with it and evaluate them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "o0u4QzT2xMgF",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "c83335e6-5d7e-4e8b-9de0-3c2d94b2c932"
      },
      "source": [
        "# Make predictions with character model only\n",
        "model_3_pred_probs = model_3.predict(val_char_dataset)\n",
        "model_3_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[0.3246336 , 0.25894612, 0.09072307, 0.2602581 , 0.06543914],\n",
              "       [0.17144628, 0.56041974, 0.05648327, 0.08831563, 0.12333511],\n",
              "       [0.22866122, 0.47663575, 0.08861542, 0.16486847, 0.04121907],\n",
              "       ...,\n",
              "       [0.03218718, 0.06239696, 0.27083528, 0.04975348, 0.5848271 ],\n",
              "       [0.03308214, 0.18173702, 0.27316839, 0.03617842, 0.47583404],\n",
              "       [0.44467857, 0.39148965, 0.061358  , 0.09230034, 0.01017348]],\n",
              "      dtype=float32)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 94
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "qdPUXiZux68-",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "681e3ecc-f35c-48fa-8dc3-293154ecc104"
      },
      "source": [
        "# Convert predictions to classes\n",
        "model_3_preds = tf.argmax(model_3_pred_probs, axis=1)\n",
        "model_3_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(30212,), dtype=int64, numpy=array([0, 1, 1, ..., 4, 4, 0])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 95
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4NCDZD7cyoj7",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "03c27a56-2e91-4b5e-dda2-7fa12db3e28b"
      },
      "source": [
        "# Calculate Conv1D char only model results\n",
        "model_3_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                        y_pred=model_3_preds)\n",
        "model_3_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 65.1429895405799,\n",
              " 'f1': 0.638714820286445,\n",
              " 'precision': 0.6460081854077261,\n",
              " 'recall': 0.651429895405799}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 96
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1krE-3csz3N-"
      },
      "source": [
        "## Model 4: Combining pretrained token embeddings + character embeddings (hybrid embedding layer)\n",
        "\n",
        "Alright, now things are going to get spicy.\n",
        "\n",
        "In moving closer to build a model similar to the one in Figure 1 of [*Neural Networks for Joint Sentence Classification\n",
        "in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf), it's time we tackled the hybrid token embedding layer they speak of.\n",
        "\n",
        "This hybrid token embedding layer is a combination of token embeddings and character embeddings. In other words, they create a stacked embedding to represent sequences before passing them to the sequence label prediction layer.\n",
        "\n",
        "So far we've built two models which have used token and character-level embeddings, however, these two models have used each of these embeddings exclusively.\n",
        "\n",
        "To start replicating (or getting close to replicating) the model in Figure 1, we're going to go through the following steps:\n",
        "1. Create a token-level model (similar to `model_1`)\n",
        "2. Create a character-level model (similar to `model_3` with a slight modification to reflect the paper)\n",
        "3. Combine (using [`layers.Concatenate`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Concatenate)) the outputs of 1 and 2\n",
        "4. Build a series of output layers on top of 3 similar to Figure 1 and section 4.2 of [*Neural Networks for Joint Sentence Classification\n",
        "in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf)\n",
        "5. Construct a model which takes token and character-level sequences as input and produces sequence label probabilities as output"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "5DI2KQf7z-yo"
      },
      "source": [
        "# 1. Setup token inputs/model\n",
        "token_inputs = layers.Input(shape=[], dtype=tf.string, name=\"token_input\")\n",
        "token_embeddings = tf_hub_embedding_layer(token_inputs)\n",
        "token_output = layers.Dense(128, activation=\"relu\")(token_embeddings)\n",
        "token_model = tf.keras.Model(inputs=token_inputs,\n",
        "                             outputs=token_output)\n",
        "\n",
        "# 2. Setup char inputs/model\n",
        "char_inputs = layers.Input(shape=(1,), dtype=tf.string, name=\"char_input\")\n",
        "char_vectors = char_vectorizer(char_inputs)\n",
        "char_embeddings = char_embed(char_vectors)\n",
        "char_bi_lstm = layers.Bidirectional(layers.LSTM(25))(char_embeddings) # bi-LSTM shown in Figure 1 of https://arxiv.org/pdf/1612.05251.pdf\n",
        "char_model = tf.keras.Model(inputs=char_inputs,\n",
        "                            outputs=char_bi_lstm)\n",
        "\n",
        "# 3. Concatenate token and char inputs (create hybrid token embedding)\n",
        "token_char_concat = layers.Concatenate(name=\"token_char_hybrid\")([token_model.output, \n",
        "                                                                  char_model.output])\n",
        "\n",
        "# 4. Create output layers - addition of dropout discussed in 4.2 of https://arxiv.org/pdf/1612.05251.pdf\n",
        "combined_dropout = layers.Dropout(0.5)(token_char_concat)\n",
        "combined_dense = layers.Dense(200, activation=\"relu\")(combined_dropout) # slightly different to Figure 1 due to different shapes of token/char embedding layers\n",
        "final_dropout = layers.Dropout(0.5)(combined_dense)\n",
        "output_layer = layers.Dense(num_classes, activation=\"softmax\")(final_dropout)\n",
        "\n",
        "# 5. Construct model with char and token inputs\n",
        "model_4 = tf.keras.Model(inputs=[token_model.input, char_model.input],\n",
        "                         outputs=output_layer,\n",
        "                         name=\"model_4_token_and_char_embeddings\")"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ODM7t4aaVhcO"
      },
      "source": [
        "Woah... There's a lot going on here, let's get a summary and plot our model to visualize what's happening."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "21PRnEmK2a0Y",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "95615ef0-6cac-46ea-bc89-f59220374619"
      },
      "source": [
        "# Get summary of token and character model\n",
        "model_4.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model_4_token_and_char_embeddings\"\n",
            "__________________________________________________________________________________________________\n",
            "Layer (type)                    Output Shape         Param #     Connected to                     \n",
            "==================================================================================================\n",
            "char_input (InputLayer)         [(None, 1)]          0                                            \n",
            "__________________________________________________________________________________________________\n",
            "token_input (InputLayer)        [(None,)]            0                                            \n",
            "__________________________________________________________________________________________________\n",
            "char_vectorizer (TextVectorizat (None, 290)          0           char_input[0][0]                 \n",
            "__________________________________________________________________________________________________\n",
            "universal_sentence_encoder (Ker (None, 512)          256797824   token_input[0][0]                \n",
            "__________________________________________________________________________________________________\n",
            "char_embed (Embedding)          (None, 290, 25)      1750        char_vectorizer[1][0]            \n",
            "__________________________________________________________________________________________________\n",
            "dense_4 (Dense)                 (None, 128)          65664       universal_sentence_encoder[1][0] \n",
            "__________________________________________________________________________________________________\n",
            "bidirectional (Bidirectional)   (None, 50)           10200       char_embed[1][0]                 \n",
            "__________________________________________________________________________________________________\n",
            "token_char_hybrid (Concatenate) (None, 178)          0           dense_4[0][0]                    \n",
            "                                                                 bidirectional[0][0]              \n",
            "__________________________________________________________________________________________________\n",
            "dropout (Dropout)               (None, 178)          0           token_char_hybrid[0][0]          \n",
            "__________________________________________________________________________________________________\n",
            "dense_5 (Dense)                 (None, 200)          35800       dropout[0][0]                    \n",
            "__________________________________________________________________________________________________\n",
            "dropout_1 (Dropout)             (None, 200)          0           dense_5[0][0]                    \n",
            "__________________________________________________________________________________________________\n",
            "dense_6 (Dense)                 (None, 5)            1005        dropout_1[0][0]                  \n",
            "==================================================================================================\n",
            "Total params: 256,912,243\n",
            "Trainable params: 114,419\n",
            "Non-trainable params: 256,797,824\n",
            "__________________________________________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "EF5-v5cRSmuk",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 856
        },
        "outputId": "f11b5e74-7d69-4824-f084-47e23f7c2250"
      },
      "source": [
        "# Plot hybrid token and character model\n",
        "from keras.utils import plot_model\n",
        "plot_model(model_4)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAm8AAANHCAYAAABgvrpnAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nOzdeVxU5f4H8M8BBoZBGCARVEAUNVzA3MpMK7RbqTdLASW10jKXFrXUzCWz0sz0ijeXlqt5f+m9CmhXTVv13lKvS2oaaoprLgli7LLIAN/fH14nR7ZhmOFw8PN+veYPzznznO9zloePZ86cUUREQERERESa4KR2AURERERkPYY3IiIiIg1heCMiIiLSEIY3IiIiIg1xUbsAR9m9ezcWLlyodhlEddK9996LV199Ve0yiIjIBvX2ytuFCxewbt06tcsgqnP27NmD3bt3q10GERHZqN5eebshMTFR7RKI6pSYmBi1SyAiohqot1feiIiIiOojhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4s9KIESOg1+uhKAoKCwtVreXLL7+E0WjEF198oWodNbFnzx60adMGTk5OUBQF/v7+mD17ttplWVi/fj1atGgBRVGgKAoCAgIwbNgwtcsiIqLbnIvaBWjFypUr0bRpU8yZM0ftUiAiapdQY926dcOxY8fw6KOP4ptvvkFycjK8vb3VLstCVFQUoqKi0LJlS/z+++9ITU1VuyQiIiJeedOifv36ITs7G4899pjapaCgoADdu3dXuwy7qE99ISKi+ovhzQaKoqhdQp2xYsUKpKWlqV2GXdSnvhARUf3F8HaLVatWoUuXLtDr9fDw8EBISAjeeecd83wnJyds2bIFffr0gdFoROPGjfHpp59atLFjxw60bdsWRqMRer0e4eHh+OabbwAA77//PgwGAzw9PZGWloaJEyeiadOmSE5Otqq+nTt3Ijg4GIqiYMmSJQCAZcuWwcPDAwaDARs3bkSfPn3g5eWFwMBArFmzxvzeDz74AHq9Ho0aNcKYMWPQuHFj6PV6dO/eHXv37jUvN27cOLi6uiIgIMA87cUXX4SHhwcURcHvv/8OAJgwYQImTpyI06dPQ1EUtGzZEgDw9ddfw8vLy6aPmOtaX6qrsn0/cuRI8/1zoaGhOHjwIIDr91MaDAYYjUZs2rQJAFBSUoKZM2ciODgY7u7uiIiIQHx8PICaH0NERKRxUk/Fx8dLdbsXFxcnAGTu3LmSnp4uGRkZ8vHHH8vQoUNFRGT69OkCQLZt2yZZWVmSkZEhffv2FTc3N8nLyzO3k5iYKLNmzZKMjAxJT0+Xbt26yR133GGef6Od8ePHy+LFi2XgwIFy7Ngxq+u8cOGCAJDFixeXaXPbtm2SnZ0taWlp0rNnT/Hw8JCioiLzcqNHjxYPDw/55ZdfpLCwUI4ePSpdu3YVT09POX/+vHm5oUOHir+/v8V658+fLwDkypUr5mlRUVESGhpqsdzmzZvF09NT3n777Sr78sgjjwgAyczMrJN9EREJDQ0Vo9FYZV9Eqt73UVFR4uzsLL/99pvF+4YMGSKbNm0y/3vSpEni5uYm69atk8zMTJk2bZo4OTnJvn37LLaRLcdQdHS0REdHW7UsERHVPbzy9j8mkwlvvfUWIiMj8frrr8PX1xc+Pj547rnn0LVrV4tlu3fvDqPRCB8fH8TGxuLatWs4e/aseX50dDTefPNN+Pj4wNfXF/3790d6ejquXLli0c57772Hl156CevXr0dYWJhd+tG9e3d4eXnBz88PsbGxyMvLw/nz5y2WcXFxQZs2beDm5oa2bdti2bJlyM3NxcqVK+1SQ79+/ZCTk4M33nijRu3Uhb5UV1X7fuzYsSgpKbGoLycnB/v27UPfvn0BAIWFhVi2bBkGDBiAqKgoeHt7Y8aMGdDpdGX65YhjiIiI6jaGt/9JSkpCVlYWHnnkEYvpzs7OGD9+fI
Xv0+l0AK6Hv6qWKSkpsUOl1nN1dQVQeW0A0KVLFxgMBhw/frw2yrKJVvty677v1asXWrdujU8//dT8reG1a9ciNjYWzs7OAIDk5GTk5+ejffv25nbc3d0REBBQZ/pFRETqYXj7n5ycHACwy+MqtmzZggcffBB+fn5wc3PDa6+9VuM2Hc3Nza3MlUGtUrMvVe17RVEwZswYnDlzBtu2bQMAfPbZZ3juuefMy+Tl5QEAZsyYYb5HTlEUnDt3Dvn5+bXXGSIiqpMY3v6nSZMmAGC+gd1W58+fx4ABAxAQEIC9e/ciOzsb8+bNs0eJDmMymZCVlYXAwEC1S6mx2u7L9u3bERcXB8D6fT98+HDo9XosX74cycnJ8PLyQrNmzczz/fz8AABxcXEQEYvX7t27a6VfRERUdzG8/U9ISAh8fX3x7bff1qidw4cPw2Qy4YUXXkCLFi3Mv8pQl33//fcQEXTr1s08zcXFpcqPKOui2u7LgQMH4OHhAcD6fe/j44PBgwdjw4YNWLBgAZ5//nmL+UFBQdDr9Th06JBDaiYiIm1jePsfNzc3TJs2Ddu3b8e4cePw22+/obS0FLm5ufjll1+sbic4OBgAsHXrVhQWFuLkyZMWj66oC0pLS5GZmYni4mIkJSVhwoQJCA4OxvDhw83LtGzZEhkZGdiwYQNMJhOuXLmCc+fOlWnL19cXly5dwq+//orc3FyYTCZ89dVXNj8qpK71pSImkwmXL1/G999/bw5v1dn3Y8eOxbVr17B58+YyD1vW6/UYMWIE1qxZg2XLliEnJwclJSW4ePEiUlJSqruJiIiovlHxm64OZcujQkRElixZIuHh4aLX60Wv10vHjh1l6dKlMm/ePHF3dxcA0qpVKzl9+rSsXr1afHx8BIAEBgbKkSNHRERkypQp4uvrK97e3hITEyNLliwRABIaGiovvfSSuZ2goCBZtWpVtepbvHixBAQECAAxGAzSv39/Wbp0qRgMBovaPvnkE/Hy8hIA0qxZMzlx4oSIXH+8hk6nk6ZNm4qLi4t4eXnJE088IadPn7ZYT3p6ukRGRoper5fmzZvLyy+/LJMnTxYA0rJlS/OjOH766Sdp1qyZuLu7S48ePSQ1NVW+/PJL8fT0lNmzZ1fYjz179ki7du3EyclJAEhAQIDMmTOnTvXlww8/lNDQUAFQ6evzzz83r6uyfX/z40tERDp27ChTp04td/tcu3ZNpkyZIsHBweLi4iJ+fn4SFRUlR48etTgWbTmG+KgQIiJtU0TqwQ9lliMhIQGDBw+uF78Dak9jxoxBYmIi0tPT1S6lxrTel379+mHJkiVo3rx5ra43JiYGAJCYmFir6yUiIvvgx6a3odp+ZIkjaakvN38Mm5SUBL1eX+vBjYiItI/hrQ44fvy4xSMhKnrFxsaqXSrVwJQpU3Dy5EmcOHECI0aMsPjZNSIiImsxvNUBYWFhZR4JUd5r7dq1NVrPtGnTsHLlSmRnZ6N58+ZYt26dnXpQ+7TYF4PBgLCwMDz00EOYNWsW2rZtq3ZJRESkQbznjeg2w3veiIi0jVfeiIiIiDSE4Y2IiIhIQxjeiIiIiDSE4Y2IiIhIQxjeiIiIiDSE4Y2IiIhIQxjeiIiIiDSE4Y2IiIhIQxjeiIiIiDSE4Y2IiIhIQxjeiIiIiDSE4Y2IiIhIQxjeiIiIiDTERe0CHC0mJkbtEojqlD179qBbt25ql0FERDaqt1fegoKCEB0drXYZ9D+bNm3CpUuX1C6DAHTr1g333nuv2mUQEZGNFBERtYug+k9RFMTHx2PQoEFql0JERKRp9fbKGxEREVF9xPBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawv
BGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQaooiIqF0E1S9PPfUUDh06ZDHt119/hZ+fHzw8PMzTdDodvvjiCzRt2rS2SyQiItIsF7ULoPrnzjvvxOrVq8tMv3r1qsW/w8LCGNyIiIiqiR+bkt09+eSTUBSl0mV0Oh2GDx9eOwURERHVI/zYlByic+fOOHToEEpLS8udrygKzpw5g5CQkNotjIiISON45Y0c4umnn4aTU/mHl6IouPvuuxnciIiIbMDwRg4xePDgCq+6OTk54emnn67lioiIiOoHhjdyiICAAPTs2RPOzs7lzo+KiqrlioiIiOoHhjdymKeeeqrMNCcnJ0RGRsLf31+FioiIiLSP4Y0cJiYmptz73soLdURERGQdhjdyGC8vLzz66KNwcfnjcYLOzs54/PHHVayKiIhI2xjeyKGGDRuGkpISAICLiwv69+8Po9GoclVERETaxfBGDtW/f3+4u7sDAEpKSjB06FCVKyIiItI2hjdyKL1ej4EDBwIADAYD+vTpo3JFRERE2lbt3za9ePEidu3a5YhaqJ4KCgoCAHTt2hWbNm1SuRrSkqCgINx7771ql1FjCQkJapdAVG8MGjRI7RJUV+2fx0pISMDgwYMdVQ8RkVl0dDQSExPVLqPGqvqtXyKyHn/V04Yrbzdw41F1zJo1CzNmzLD45ilRZWJiYtQuwa7i4+N5xYCoBnjx6A+8541qBYMbERGRfTC8Ua1gcCMiIrIPhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQhjciIiIiDWF4IyIiItIQ1cPbiBEjoNfroSgKCgsL1S5Hs7788ksYjUZ88cUXapcCAIiNjYWiKFa9Nm/e7LA6Ro8eDQ8PDyiKAp1Ohw4dOuDYsWMWy3z66acIDg6Goijw9/fH3//+d4fVY6va2r917Tii6pk7dy6MRiMURcGhQ4fULsdCfTi29uzZgzZt2sDJyck8XsyePVvtsiysX78eLVq0MI+vAQEBGDZsmNplkZ2pHt5WrlyJSZMmqV2G5omI2iWU8e233yIrKwsmkwkpKSkAgP79+6OoqAh5eXlIS0vD888/79AaPv74Y+zevRsA0LlzZ/z8889o06aNxTLPPvssduzYgSZNmuDixYsYPny4Q2uyRW3t37p4HJH1pk6dio8//ljtMspVH46tbt264dixY3j44YcBAMnJyZgxY4bKVVmKiorCmTNnEBoaCqPRiNTUVKxevVrtssjOVA9vt6OCggJ0797drm3269cP2dnZeOyxx+zarq0URcF9990Ho9EIFxcXi+k6nQ4GgwF+fn7o3LmzXddb3raNiIhAjx49sHfvXvz000/lvu+jjz7Cs88+C51O55AaasoR+7e8OuvacXS7csQxpLa6dGzVp+1bn/pC1qtT4U1RFLVLqBUrVqxAWlqa2mU41Jo1a2AwGKpcbvTo0fjzn/9st/VWtG1feuklAMDSpUvLzCsqKsJnn32G0aNHO7SGukYrdd6OuG8cqz5t3/rUF7JerYW3VatWoUuXLtDr9fDw8EBISAjeeeedPwpxcsKWLVvQp08fGI1GNG7cGJ9++qlFGzt27EDbtm1hNBqh1+sRHh6Ob775BgDw/vvvw2AwwNPTE2lpaZg4cSKaNm2K5ORkq+pr06YNFEWBk5MTOnfujPz8fADAa6+9Zl7fjXuhSkpKMHPmTAQHB8Pd3R0RERGIj4+3qr8TJkzAxIkTcfr0aSiKgpYtWwK4/pHCwoUL0aZNG7i5ucHHxwdPPPEEjh8/bm6zoj6uWLHCfM/WkiVLAACnTp2q8B6z7777rsp+VLY9v/76a3h5eW
HOnDlWbVtrVFbL3//+dzRo0ACKosDHxwcbNmzA/v370axZMzg7O2PIkCEAUOG2Ba5/lNCkSROsXbsWWVlZFutet24d7rnnHgQGBtar/VvZ+VJenTt37iyzHmtrX7ZsGTw8PGAwGLBx40b06dMHXl5eCAwMxJo1a2pwZNx+anIMlefy5csICQmBi4sLHn30UfP0yo5ze+/P8o4ta9fxwQcfQK/Xo1GjRhgzZgwaN24MvV6P7t27Y+/eveblxo0bB1dXVwQEBJinvfjii+Z7Xn///fdKt29NxrW61pfqqmysGDlypHlsCQ0NxcGDBwFcv1/dYDDAaDRi06ZNAGz/m0I2kGqKj4+X6r4tLi5OAMjcuXMlPT1dMjIy5OOPP5ahQ4eKiMj06dMFgGzbtk2ysrIkIyND+vbtK25ubpKXl2duJzExUWbNmiUZGRmSnp4u3bp1kzvuuMM8/0Y748ePl8WLF8vAgQPl2LFjVtVYXFwsISEhEhwcLMXFxRbzXnnlFYmLizP/e9KkSeLm5ibr1q2TzMxMmTZtmjg5Ocm+ffus6m9UVJSEhoZarGPmzJni6uoqq1atkqysLElKSpJOnTpJw4YNJTU1tco+XrhwQQDI4sWLRUTk5MmT8vrrr5u3X0pKivj4+Ej37t2lpKTEqn5UtK7NmzeLp6envP3221Zt2xvrByCPP/54ufOrquWXX34Rg8EgzzzzjPk9U6dOleXLl1u0U962vWHWrFkCQBYuXGgxvUePHrJ161ara9HK/q3qfCmvzlvXY0vt27Ztk+zsbElLS5OePXuKh4eHFBUVlbtPKhMdHS3R0dHVfl9dBEDi4+OtXr4mx9CaNWsEgBw8eFBERIqKiiQqKko2btxo0Z6157+99md5x5a16xg9erR4eHjIL7/8IoWFhXL06FHp2rWreHp6yvnz583LDR06VPz9/S3WO3/+fAEgV65cqXT7Vmdce+SRRwSAZGZm1sm+iIiEhoaK0Wissi8i1o0Vzs7O8ttvv1m8b8iQIbJp0ybzv239m2ItW/JHfeXw8FZUVCTe3t4SGRlpMb24uFgWLVokIn/s0IKCAvP8zz77TADIkSNHKmz73XffFQCSlpZWYTvVceOPckJCgnlaXl6eBAcHS3Z2toiIFBQUiMFgkNjYWPMy+fn54ubmJi+88IJV/b31ZMvPz5cGDRpYtCki8uOPPwoAi8Gkoj6WNzDebMCAAaLX6+X48eNW9aOyddmisvBmTS0iIh9//LEAkNWrV8s///lPefXVV8u0VVl4S0lJEZ1OJ61bt5bS0lIREUlKSpKwsDCra9HK/i3PreeLNeGtprUvXbpUAMipU6cqrKsiDG+2HUM3hzeTySRPPvmkfPXVVxbvs/X8r8n+rCy8VbWO0aNHlwki+/btEwDy1ltvmafVNPBYq7LwVlf6Up3wdqtbx4qtW7cKAJk9e7Z5mezsbGnVqpX5Ykdt/E1hePuDwz82TUpKQlZWFh555BGL6c7Ozhg/fnyF77tx47jJZKpymZKSEjtUev3ysNFoxKJFi8zTVq9ejSeeeAJeXl4Arn+7KD8/H+3btzcv4+7ujoCAABw/ftym/h49ehRXr15Fly5dLKZ37doVrq6uFpfTbZGQkIB//etfeOutt3DnnXda1Y/aZG0to0aNQnR0NMaMGYOEhAS8//771VpPQEAAoqKicOLECWzduhUA8OGHH2Ls2LFW16KV/VseW86Xmtbu6uoKoPLzmKpmy34oKSnBkCFD0KhRI4uPSwHbz//a2J/WrqNLly4wGAy1Pl5Vh1b7cutY0atXL7Ru3Rqffvqp+VvDa9euRWxsLJydnQHUrb8ptwOHh7ecnBwAgLe3d43b2rJlCx588EH4+fnBzc0Nr732Wo3bvFmDBg0watQo7Nq1Cz/++COA63/cx40bZ14mLy8PADBjxgyL+4zOnTuH/Px8m/p74x6sBg0alJnn7e2N3Nxcm/uUnp6Ol19+GV27dsXEiROt7kdtqk
4tc+bMwdWrV22+QffGFxeWLVuG3Nxc/Otf/8IzzzxjdS1a2b+Afc4XR9ZO1rNlP7z00ks4efIkPvroI/zyyy8W8+rS+V8Tbm5uuHLlitpl2IWafalqrFAUBWPGjMGZM2ewbds2AMBnn32G5557zrxMfTmmtMLh4a1JkyYAYL7B0lbnz5/HgAEDEBAQgL179yI7Oxvz5s2zR4kWxo0bB51Oh7i4OGzfvh1BQUEIDQ01z/fz8wMAxMXFQa5/7Gx+7d6926b+3ggC5Q3AWVlZ5hvpbTF+/HhkZWVh5cqV5v8hWdOP2mRtLSaTCePHj8fChQuxe/dumx6Oed9996Fjx4744osvMHfuXDz++OMwGo1W16KV/Wuv88WRtZP1bNkPgwYNwnfffQdvb288/fTTKC4uNs+rS+e/rUwmU705Bmu7L9u3b0dcXBwA68eK4cOHQ6/XY/ny5UhOToaXlxeaNWtmnl8fjiktcXh4CwkJga+vL7799tsatXP48GGYTCa88MILaNGihflXGewtMDAQgwYNwrp16/DGG29gwoQJFvODgoKg1+srfHq5Lf1t3749GjRogP3791tM37t3L4qKimx+FtqWLVvwj3/8A2+88QbatWtnnj558uQq+1GbrK3l5ZdfxvPPP49XXnkFr776Kt555x2bBoUXX3wRJSUleO+99/DCCy9Uqxat7F97nS+Oqp2qx5b9EBkZiYYNG+KTTz7BgQMHLP6zU5fOf1t9//33EBF069bNPM3FxUWTH9HXdl8OHDgADw8PANb/bfXx8cHgwYOxYcMGLFiwoMwD1uvDMaUlDg9vbm5umDZtGrZv345x48bht99+Q2lpKXJzc8tcyq9McHAwAGDr1q0oLCzEyZMna3yvUEUmTpyI4uJiZGZmolevXhbz9Ho9RowYgTVr1mDZsmXIyclBSUkJLl68iJSUFKv66+vri0uXLuHXX39Fbm4unJ2dMXHiRHz++edYvXo1cnJycPjwYYwdOxaNGze26fljOTk5GDNmDO666y68/vrrAIDCwkLs378fhw4dqrIflfnqq6/s+qgQa2pZunQpmjZtioEDBwIA3n33XbRt2xZDhw41f5QJlN225Q1+Q4YMga+vL+677z5ERERUqxat7F9rzhdrtpVer7d77VQ1ex5D/fv3x/DhwzFnzhwcOHAAgHXnXF1TWlqKzMxMFBcXIykpCRMmTEBwcLDFL6K0bNkSGRkZ2LBhA0wmE65cuYJz586Vaau8Y9/e45qafamIyWTC5cuX8f3335vDW3X+to4dOxbXrl3D5s2byzxsWYvHlKZV9xsOtn7bY8mSJRIeHi56vV70er107NhRli5dKvPmzRN3d3cBIK1atZLTp0/L6tWrxcfHRwBIYGCg+RunU6ZMEV9fX/H29paYmBhZsmSJAJDQ0FB56aWXzO0EBQXJqlWrql3jzSIjI8s8huKGa9euyZQpUyQ4OFhcXFzEz89PoqKi5OjRo1X2V0Tkp59+kmbNmom7u7v06NFDUlNTpbS0VObPny+tWrUSnU4nPj4+MmDAAElOTja3efO2urmPixcvloCAAAEgBoNB+vfvLwsWLBAA5b769u1bZT8qWpeIyJdffimenp4W3zyqSE5Ojtx///3i6+srAMTJyUlatmwpc+bMsXqbPvbYY6Ioivj6+squXbtE5PrjW5ycnASAGI1G2b9/f4XbtjyTJ0+Wf/7zn/V6/1Z2vpw/f75MnTNmzCizHhGxqvalS5eKwWCwOI8/+eQT8fLyEgDSrFkzOXHiRJXHy81u52+b2noMrV+/3jx2hoSESFpamuTk5EhQUJAAkAYNGshnn30mIpUf5/ben+Udw9VZx+jRo0Wn00nTpk3FxcVFvLy85IknnpDTp09brCc9PV0iIyNFr9dL8+bN5eWXX5bJkycLAGnZsqX5URzlbV9rxrU9e/ZIu3btzGNPQECAzJkzp0715cMPP5TQ0NAKx4cbr88//9y8rqrGipt17NhRpk
6dWu72sfVvirX4bdM/KCLV+8G5hIQEDB48uF78Th0R1V0xMTEAgMTERJUrqTlFURAfH49BgwapXYomjRkzBomJiUhPT1e7lBrTel/69euHJUuWoHnz5rW+buaPP9Spn8ciIiIqj70eCVUXaKkvN38Mm5SUBL1er0pwI0v1OrwdP368wp8QuvkVGxurdqlERJrC8fX2MGXKFJw8eRInTpzAiBEjLH7WktTjonYBjhQWFsbLq0REDlBb4+u0adOwcuVKFBUVoXnz5pg/fz6io6Mdvl5H0GJfDAYDwsLC0LRpUyxduhRt27ZVuyQCwHveiKhO4j1vRHQz5o8/1OuPTYmIiIjqG4Y3IiIiIg1heCMiIiLSEIY3IiIiIg1heCMiIiLSEIY3IiIiIg1heCMiIiLSEIY3IiIiIg1heCMiIiLSEIY3IiIiIg1heCMiIiLSEIY3IiIiIg1heCMiIiLSEBdb35iQkGDPOoiILFy8eBGBgYFql2E3u3fvVrsEIk3jOfQHRUSkOm9ISEjA4MGDHVUPEZFZdHQ0EhMT1S6jxhRFUbsEonqjmrGlXqp2eCOyhaIoiI+Px6BBg9QuhYiojBtjEz9VIi3gPW9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGsLwRkRERKQhDG9EREREGuKidgFU/3zyySfIzMwsM33jxo04e/asxbThw4fD39+/tkojIsIPP/yAPXv2WEw7fvw4AGDevHkW07t164YHHnig1mojsiBvLR8AACAASURBVIYiIqJ2EVS/jB49Gp988gnc3NzM00QEiqKY/11cXAyj0YjU1FTodDo1yiSi29R3332Hhx9+GDqdDk5O5X8AVVpaCpPJhG+//RZ/+tOfarlCosoxvJHdff/994iMjKx0GZ1Oh1GjRmHJkiW1VBUR0XUlJSXw9/dHenp6pcv5+PggLS0NLi78kIrqFt7zRnZ3//33o1GjRpUuYzKZ8OSTT9ZSRUREf3B2dsbQoUPh6upa4TKurq546qmnGNyoTmJ4I7tzcnLCsGHDKh0YGzdujO7du9diVUREf3jyySdRVFRU4fyioiL+B5PqLIY3cojKBkadToenn37a4h44IqLa1K1bNwQHB1c4PzAwEPfcc08tVkRkPYY3coguXbqgefPm5c7jR6ZEVBcMGzas3C9Mubq64plnnuF/MKnOYngjh3n66afLHRhbtGiBDh06qFAREdEfhg0bBpPJVGZ6UVERYmNjVaiIyDoMb+Qw5Q2MOp0OI0aMUKkiIqI/tGnTBm3atCkzPSwsDO3bt1ehIiLrMLyRw7Rs2RLh4eEWHz2YTCYMHjxYxaqIiP5w6ycEOp0OzzzzjIoVEVWN4Y0c6umnn4azszMAQFEUdOzYEa1atVK5KiKi64YMGYLi4mLzv4uLi/mRKdV5DG/kUEOGDEFJSQmA689W4v9oiaguCQ4ORpcuXeDk5ARFUdC1a1eEhISoXRZRpRjeyKGaNGmC7t27Q1EUlJaWIiYmRu2SiIgsPP3003BycoKzszOeeuoptcshqhLDGzncU089BRHB/fffjyZNmqhdDhGRhcGDB0NEICL8DyZpg2hUfHy8AOCLL74c8IqOjnbYuRsdHa16//jiiy++tPIqbzzW/I+2xcfHq10CWeEvf/kLRo8ejQYNGqhdClUhLi7O4evo1q0bXnnlFYevh8haP/zwAxRFwf333692KURmFY3Hmg9vgwYNUr
sEskL37t0RGBiodhlkhcTERIevIzAwkOcu1SmPPvooAMDLy0vlSoj+UNF4rPnwRtrA4EZEdRlDG2kJv7BAREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb/8zYsQI6PV6KIqCwsJCtctxiK5du8LZ2Rl33XWX3dseOXIkPD09oSgKDh06VK33rl+/Hi1atICiKBW+QkJC7FJnXdgGFS335Zdfwmg04osvvrB7bWRJS9t6wYIFaNSoERRFwUcffaR2OaSSmoyxtrp1bA4KCsKKFSvM83/44Qc0bdoUiqIgICAAn3zySa3UZU2tAQEBGDZsmGr1OBrD2/+sXLkSkyZNUrsMh9q3bx8iIyMd0vby5cvxt7/9zab3RkVF4cyZMwgNDYXRaISIQERQXFyM/Px8XL58GQaDwS511oVtUNFyIuKIsqgcWtrWkyZNwq5du9Qug1RWkzHWVreOzRcuXMBzzz1nnn///fejb9++GDVqFFJSUjBq1Khara+yWlNTU7F69WrV6nE0F7ULoNqnKIraJVjF2dkZ7u7ucHd3R+vWre3adl3cBv369UN2drbaZdwWuK3tq6CgAL1792bIvI2UlpZi5MiR0Ov1WLp0aZ0cU+szXnkrR30/CHU6nUPadeR227Bhg13bU3sb1MYxJiJITExU9aMMuj2sWLECaWlpapdR79WVv02lpaV49tlnYTAYsGzZsjpT1+3ktgtvq1atQpcuXaDX6+Hh4YGQkBC888475vlOTk7YsmUL+vTpA6PRiMaNG+PTTz+1aGPHjh1o27YtjEYj9Ho9wsPD8c033wAA3n//fRgMBnh6eiItLQ0TJ05E06ZNkZycbHWNJSUlmDlzJoKDg+Hu7o6IiAjEx8cDABYtWgQPDw84OTmhc+fO8Pf3h06ng4eHBzp16oSePXsiKCgIer0e3t7eeO2118q0f+rUKYSFhcHDwwPu7u7o2bMndu7caXUNwPVgMH/+fNx5551wc3OD0WjE5MmTy6zr66+/hpeXF+bMmWN1/6uitW1gzXI7d+5EcHAwFEXBkiVLAADLli2Dh4cHDAYDNm7ciD59+sDLywuBgYFYs2ZNmVrfffdd3HnnnXB3d0fDhg3RvHlzvPvuuxg0aJDN27ouGDduHFxdXREQEGCe9uKLL8LDwwOKouD3338HYP32Km9bt2nTBoqimI+p/Px8AMBrr71mPs///ve/A6j8uKjs/P/hhx9w9913w2AwwMvLC+Hh4cjJyQFQ+ZhSU5Wtt7K+WLs9J0yYgIkTJ+L06dNQFAUtW7a0W9s3VDZuV3WeWqu26hURLFy4EG3atIGbmxt8fHzwxBNP4Pjx4xZtWDu+2Ho82jo2l5aWYvjw4TAajebzp7rbs7K6qjoXKjueq6uydY0cOdJ8/1xoaCgOHjwI4Pr98QaDAUajEZs2bapRX2tENCo+Pl6qW35cXJwAkLlz50p6erpkZGTIxx9/LEOHDhURkenTpwsA2bZtm2RlZUlGRob07dtX3NzcJC8vz9xOYmKizJo1SzIyMiQ9PV26desmd9xxh3n+jXbGjx8vixcvloEDB8qxY8esrnPSpEni5uYm69atk8zMTJk2bZo4OTnJvn37RETkzTffFACyd+9eycvLk99//10effRRASBbtmyRK1euSF5enowbN04AyKFDh8xt9+7dW1q0aCFnz54Vk8kkR44ckXvuuUf0er2cOHHC6hqmT58uiqLIX/7yF8nMzJT8/HxZunSpAJCDBw+a29m8ebN4enrK22+/XWW/Q0NDxWg0WkwbP368HD58uMyyWtoG1i534cIFASCLFy+2eO+NYzI7O1vS0tKkZ8+e4uHhIUVFRebl5syZI87OzrJx40bJz8+XAwcOiL+/vzz44INVbvdbRUdHS3R0dLXf58j2hw
4dKv7+/hbT5s+fLwDkypUr5mnWbq9bt3VxcbGEhIRIcHCwFBcXW6znlVdekbi4OPO/rTkubj3/9+/fL15eXjJv3jwpKCiQ1NRUGThwoLn2qsaUkydPCgD58MMPq7Xdrl69Wul6re1LVdszKipKQkNDLdZtr7arGrerWo+1aqvemTNniqurq6xatUqysrIkKSlJOnXqJA0bNpTU1FRzO9aOG7Ycj8eOHbNpbC4uLpahQ4eKTqeT5ORku2zPW+uq7Fyo6ni+uVZrVHXeRUVFibOzs/z2228W7xsyZIhs2rSpxn21RkXj5W0T3oqKisTb21siIyMtphcXF8uiRYtE5I8NXFBQYJ7/2WefCQA5cuRIhW2/++67AkDS0tIqbMdaBQUFYjAYJDY21jwtPz9f3Nzc5IUXXhCRP4JLbm6ueZn/+7//EwAWQefHH38UALJ27VrztN69e0uHDh0s1pmUlCQAZNKkSVbVkJ+fLwaDQf70pz9ZtLNmzZoyA0t1hIaGCoAyr8rCW13fBtXZVpWFt5uPpRsD+KlTp8zTunbtKnfffbfFOkaNGiVOTk5y7dq1MtuvMvUhvFW1vcrb1jf+6CYkJJin5eXlSXBwsGRnZ4uIdedneTUcOXJEAMjmzZut6u+tY4qt4a2y9dral/K2563hzV5tVzVuW7Mea9RWvfn5+dKgQQOL9Yj8MU7dCFLWjhu21l1doaGh4unpKU8++aR06tRJAEi7du3k6tWr5S5vz7puPhesOY+qE94qW5eIyNatWwWAzJ4927xMdna2tGrVyvyfPEfvg4rGy9vmY9OkpCRkZWXhkUcesZju7OyM8ePHV/i+G/dGmUymKpcpKSmpcZ3JycnIz89H+/btzdPc3d0REBBQ5rL6zVxdXQEAxcXFZeqqrHYACA8Ph9FoRFJSklU1nDp1Cvn5+ejdu3f1O1iFm79tKiKV7ptb1cVt4IhtdaOfN/epsLCwzDcoS0pKoNPp4OzsbLd1a1F526s8I0eOhNFoxKJFi8zTVq9ejSeeeAJeXl4AbD8/W7RogUaNGmHYsGGYNWsWfv3110prsdeYUtl6azrWVLY97dV2VeO2retRq96jR4/i6tWr6NKli8X8rl27wtXVFXv37gVg/bhhr/5bIz8/Hw888AAOHDiAAQMG4OjRoxg5cqTD67r5XKjueVRdt553vXr1QuvWrfHpp5+ax9e1a9ciNjbWPK7W5j642W0T3m58Ju7t7V3jtrZs2YIHH3wQfn5+cHNzK/eeKlvl5eUBAGbMmGHxnLNz586Z78NxBJ1OZx6Aqqrh4sWLAAA/Pz+H1XPDokWLLE4KR3LENqitbdW3b18cOHAAGzduREFBAfbv348NGzbgz3/+820f3qzVoEEDjBo1Crt27cKPP/4IAPjwww8xbtw48zK2np/u7u7497//jR49emDOnDlo0aIFYmNjUVBQAMBxY0pl63XkWGOvtqsat+21ntqqNysrC8D1Y+1W3t7eyM3NBWD9uFGbfy8aNGiA0aNHA7j+aK0WLVpg7dq1iIuLs2tdlZ0LVZ1H1VXVeacoCsaMGYMzZ85g27ZtAIDPPvvM4nEpav3Nvm3CW5MmTQDAfHOzrc6fP48BAwYgICAAe/fuRXZ2NubNm2ePEgH8cbLGxcVZXIESEezevdtu67lZcXExMjIyEBwcbFUNer0eAHDt2jWH1KMGR22D2tpWs2bNQq9evTB8+HB4eXlh4MCBGDRoUK0/F0rrxo0bB51Oh7i4OGzfvh1BQUEIDQ01z6/J+dmuXTt88cUXuHTpEqZMmYL4+HgsWLDA4WNKRet15Fhjr7arGrfttZ7aqvdGqLsR0m6WlZWFwMBAANaPG2r8vQCuf0KSmJhoDjzbt2+3S13WnAsVHc/W2L59uzlsWnveDR8+HHq9HsuXL0dycjK8vLzQrFmzGve1pm6b8BYSEgJfX198++23NWrn8OHDMJlMeOGFF9CiRQvzrzLYy41vSdbWE7QB4D//+Q
9KS0vRqVMnq2po3749nJyc8MMPP9RajSkpKRgxYoTD2nfUNqitbXX06FGcPn0aV65cgclkwvnz57Fs2TL4+Pg4dL21xcXFpcqPPe0hMDAQgwYNwrp16/DGG29gwoQJFvNtPT8vXbqEX375BcD1wX7u3Lno1KkTfvnlF4eOKZWt15Fjjb3armrcttd6aqve9u3bo0GDBti/f7/F9L1796KoqAidO3c2L2fNuKHG34sbOnXqhLi4OBQXF2PQoEG4dOlSjeuq6lyo7Hi2xoEDB+Dh4WHVum7w8fHB4MGDsWHDBixYsADPP/+8xXy19sFtE97c3Nwwbdo0bN++HePGjcNvv/2G0tJS5ObmWr3jAZivzGzduhWFhYU4efKk+T4Fe9Dr9RgxYgTWrFmDZcuWIScnByUlJbh48SJSUlLsso6ioiJkZ2ejuLgYP/30E8aNG4dmzZph+PDhVtXg5+eHqKgorFu3DitWrEBOTg6SkpLKfZ7YV199VaNHhYgICgoKsH79evN9R/ZQW9ugOtuqJl566SUEBwfj6tWrdm23rmjZsiUyMjKwYcMGmEwmXLlyBefOnXPIuiZOnIji4mJkZmaiV69eFvNsPT8vXbqEMWPG4Pjx4ygqKsLBgwdx7tw5dOvWzaFjSmXrtedY4+vri0uXLuHXX39Fbm4unJ2d7dJ2VeO2vfpgr3asqXfixIn4/PPPsXr1auTk5ODw4cMYO3YsGjdubP5Y0tpxoyZ113RsBoCxY8fiySefxOXLlxETE2P+D5atdVV1LlR2PFfGZDLh8uXL+P77783hrTrn3dixY3Ht2jVs3rwZjz32mMW82vibXa5qf/WhjrDlUSEiIkuWLJHw8HDR6/Wi1+ulY8eOsnTpUpk3b564u7sLAGnVqpWcPn1aVq9eLT4+PgJAAgMDzd84nTJlivj6+oq3t7fExMTIkiVLBICEhobKSy+9ZG4nKChIVq1aVe0ar127JlOmTJHg4GBxcXERPz8/iYqKkqNHj8qiRYvEYDAIAAkJCZEdO3bIe++9J0ajUQCIv7+//OMf/5C1a9eKv7+/ABAfHx9Zs2aNiIisXLlSIiMjpVGjRuLi4iJ33HGHPPnkk3Lu3DmraxARyc3NlZEjR8odd9whDRo0kB49esjMmTPN2+rnn38WEZEvv/xSPD09Lb6tc6vPP/+8wm+a3vyaMWOGiIjmtoE1yy1evFgCAgIEgBgMBunfv78sXbrU3M8bx+Qnn3wiXl5eAkCaNWtmfrTJv//9b7njjjsstpdOp5M2bdrI+vXrq3X81cVvm6anp0tkZKTo9Xpp3ry5vPzyyzJ58mQBIC1btpTz589bvb3K29a3ioyMlOXLl5dbS2XHxc3jyM3n/6+//irdu3cXHx8fcXZ2liZNmsj06dPN31irbEyZMGGC+Tj28PCQgQMHWr3dqlpvZX2pzvH3008/SbNmzcTd3V169OghqampdmtbpOJxu6o+VEdt1VtaWirz58+XVq1aiU6nEx8fHxkwYECZR29YO77YcjyK2DY2BwYGyrRp08rUeeeddwoAadSokaxYsaJGdVV2LuzYsaPC49navyOff/65Ves6f/68RT87duwoU6dOrfaxU1lfrVHReKmIaOhH/m6SkJCAwYMHa+o3CokcZdmyZTh58qTFzcNFRUV4/fXXsWzZMmRmZsLd3d2qtmJiYgAAiYmJDqnV0e0TEdlbv379sGTJEjRv3rxW11vReMnfNiXSuNTUVIwbN67MPReurq4IDg6GyWSCyWSyOrwREd3uTCaT+dEhSUlJ0Ov1tR7cKnPb3POmpuPHj1t8hbiiV2xsrNqlkga5u7tDp9NhxYoVuHz5MkwmEy5duoTly5dj5syZiI2Ntev9gqQujifW4XaimpgyZQpOnjyJEydOYMSIERY/o1kX8MpbLQgLC+PHu+QwRqMR3377Ld5++220bt0aeXl5aNCgAdq1a4f33nsPo0aNUrtEsiOOJ9bhdqKaMBgMCAsLQ9
OmTbF06VK0bdtW7ZIsMLwR1QM9e/bEd999p3YZRET1wuzZszF79my1y6gQPzYlIiIi0hCGNyIiIiINYXgjIiIi0hCGNyIiIiINYXgjIiIi0hCGNyIiIiINYXgjIiIi0hCGNyIiIiINYXgjIiIi0hCGNyIiIiINYXgjIiIi0hCGNyIiIiINYXgjIiIi0hAXtQuoKUVR1C6BqN6Jjo52aPvr1q3juUtEZIXyxmNFRESFWmrs4sWL2LVrl9plkI1WrFiBc+fO4e2331a7FCpHUFAQ7r33Xoe0vXv3bly4cMEhbRNVR0FBAYYPH46pU6firrvuUrsconKVNx5rNryRtn300UeYMmUKsrKyeAWGiFSRnJyMsLAwHDx4kOGNNIX3vJEqwsPDkZOTg3PnzqldChHdplJTUwEAjRs3VrkSoupheCNVREREQFEUHD58WO1SiOg2lZKSAmdnZzRs2FDtUoiqheGNVOHp6YlmzZohKSlJ7VKI6DaVmpqKRo0awdnZWe1SiKqF4Y1UExERwStvRKSa1NRUfmRKmsTwRqoJDw9neCMi1aSkpCAgIEDtMoiqjeGNVBMeHo4TJ06gsLBQ7VKI6DaUkpLCK2+kSQxvpJqIiAgUFxfj2LFjapdCRLeh1NRUXnkjTWJ4I9W0bt0a7u7u/NICEamCV95IqxjeSDXOzs4ICwvjfW9EVOtMJhMyMjJ45Y00ieGNVMVvnBKRGi5fvozS0lJeeSNNYngjVYWHh/NjUyKqdSkpKQDAK2+kSQxvpKqIiAikpqYiLS1N7VKI6DZy46exGN5IixjeSFXh4eEAgCNHjqhcCRHdTlJSUmA0GmEwGNQuhajaGN5IVQEBAWjUqBE/OiWiWsXHhJCWMbyR6vhLC0RU2/jTWKRlDG+kOn5pgYhqG38ai7SM4Y1UFx4ejqNHj6KkpETtUojoNsErb6RlDG+kuoiICBQUFOD06dNql0JEtwleeSMtY3gj1bVr1w7Ozs786JSIaoWI4PLlywxvpFkMb6Q6d3d3tGzZkl9aIKJakZmZicLCQn5sSprF8EZ1Ar9xSkS15cYDehneSKsY3qhO4DdOiai28KexSOsY3qhOiIiIwJkzZ5Cbm6t2KURUz6WmpkKn08HX11ftUohswvBGdUJERAREBEePHlW7FCKq525809TJiX8CSZt45FKd0Lx5c3h5efG+NyJyOP40FmkdwxvVCYqioG3btgxvRORwfEAvaR3DG9UZERER/NICETkcH9BLWsfwRnUGHxdCRLWBV95I6xjeqM4IDw9HRkYGLl68qHYpRFSP8cobaR3DG9UZERERAMCrb0TkMNeuXUNWVhbDG2kawxvVGT4+PggMDOR9b0TkMKmpqRARfmxKmsbwRnVKREQEr7wRkcPw1xWoPnBRuwCim4WHh+Orr77C2bNncfjwYRw+fBg///wzmjRpgkWLFqldHhFpyPnz5/G3v/0NjRo1QpMmTRAQEICkpCQoigJ/f3+1yyOymSIionYRdPvKysrCkSNHzCFt586dOHXqFK5duwYA0Ov1uHbtGiZNmoT3339f5WqJSEsKCwthNBpRXFyM0tJSi3nu7u4ICAiAv78/QkJC0LVrV7z66qsqVUpUPbzyRqrJyMhASEgIcnNz4eLiAkVRYDKZLJYpLCyEs7Mz2rRpo1KVRKRVer0eXbt2xa5du8rMKygowNmzZ3H27Fns2bMH3bt3V6FCItvwnjdSja+vL6ZNmwYnJycUFxeXCW43lJSUoG3btrVcHRHVB7169YJOp6t0GS8vLzz77LO1VBFRzTG8kapeffVVBAYGVvkD0WFhYbVUERHVJw888ACKiooqnK/T6fDKK6/Aw8OjFqsiqhne80aqW7t2LYYMGYKKDkU/Pz+kpaXVclVEVB/k5+eb73srj5ubGy5cuAA/P79arozIdrzyRqobPHgw7r77bri4lH8LZrt27Wq5IiKqLwwGAzp27FjuPJ1Oh+eff57BjTSH4Y1UpygK/v
rXv6KkpKTMPFdXV/MvLxAR2eKhhx6Cq6trmeklJSWYMGGCChUR1QzDG9UJ99xzD2JiYsrcWCwi/KYpEdVIefe96XQ6xMTEIDQ0VKWqiGzHe96ozrhw4QJatWplfsbbDT/88APuv/9+laoiIq27evUqvL29y1zdP3DgADp16qRSVUS245U3qjOCgoLwyiuvlLn3jVfeiKgmGjRoYHH7hYuLCyIjIxncSLMY3qhOmTZtGoxGIxRFAXD9+Uu8mZiIaurm+96Ki4sxbdo0lSsish3DG9Upnp6emDNnjjm88eG8RGQPN+57c3JyQvv27fHQQw+pXRKRzRjeqM4ZOXIkWrduDQDo0KGDytUQUX3Qo0cPODk5obS0FNOnT1e7HKIaKfOFhd27d2PhwoVq1UMEALh8+TJ27NiBDh06oFWrVmqXQ1SlV199Fffee68q6+a4bZ3vvvsORUVF6Nu3r/nqPgH33nsvXn31VbXLoGooc+XtwoULWLdunRq1EJn5+/sjICAAXl5eapdCVKV169bhwoULqq2f47Z1GjVqhDvvvJPB7SZ79uzB7t271S6Dqqn8R9oDSExMrM06iMo4evQovL290bRpU7VLIapUXQkDHLcrt2PHDnTu3BkGg0HtUuqMmJgYtUsgG1QY3ojUxp/FIiJ76tmzp9olENkFv7BAREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCEMb0REREQawvBGREREpCE1Dm9du3aFs7Mz7rrrriqX/fLLL2E0GvHFF19UuMzIkSPh6ekJRVFw6NChar3XkdRe/4IFC9CoUSMoioKPPvqo3GW2bt2KqVOnWrWsI23atAnz5s1DSUmJTe9fv349WrRoAUVRLF4uLi5o2LAhHnroIXz++edl3sfjy3bVOb5u3T8BAQEYNmxYlev4+eefERsbi+bNm8PNzQ0NGzZEhw4dMHv2bPMysbGxZfZ7Ra/NmzeXqeWNN96otIaFCxdCURQ4OTkhLCwM27dvr/HxWl9UdG5oWWFhIcLCwjBjxoxqv7eiccjV1RWNGjXCgw8+iPnz5yMzM9MBlRNVrsbhbd++fYiMjLRqWRGpcpnly5fjb3/7m03vdSS11z9p0iTs2rWrwvlvvvkmPvjgA0ybNq3KZR2tf//+0Ov16N27N7Kysqr9/qioKJw5cwahoaEwGo0QEYgIrly5gvj4ePz222+IiopCfHy8xft4fNmuOsfXrfsnNTUVq1evrrT9w4cPo3v37ggICMB//vMfZGdnY9euXXj00Ufx/fffWyz77bffIisrCyaTCSkpKQCuH1NFRUXIy8tDWloann/+eQCWxwpwff+aTKZyaygpKcEHH3wAAOjVqxeOHz+O+++/v8bHa31R0bmhZdOnT0dycrJN7y1vHCotLUVaWhoSEhLQvHlzTJkyBe3atcP+/fvtXDlR5ez2samiKFUu069fP2RnZ+Oxxx6rdvs1eW91FRQUoHv37qqtv7ree+89rF27FgkJCfD09LSpjfL6XBPjx49Hhw4d0LdvXxQXF9ulTR8fH/Tu3Rt//etfAQAJCQkW83l8OYY9jq8FCxbA29sbixYtQkhICPR6PVq3bo133nkH7u7u5uUURcF9990Ho9EIFxcXi+k6nQ4GgwF+fn7o3LlzmXV07twZqamp2LBhQ7k1rF+/Hk2bNi13niOOV1LXrl27cOTIEbu2qSgKvL298eCDD2LlypVISEjA5cuXzecvUW2xW3jT6XT2asqqIOhIK1asQFpamqo1WOvUqVN444038NZbb0Gv19vcjiP6PGvWLBw6dAiLFi2ya7shISEAYPNVEh5f1rPX8ZWeno7s7GxkZGRYTHd1dbX4qHjNmjUwGAxVtjd69Gj8+c9/tpj2wgsvAAA+/PDDct+zP45pagAAIABJREFUcOFCTJw4scI2HXW8ao
na54a9FBQUYPLkyQ7fl9HR0Rg+fDjS0tJUuUWFbl92C2+nTp1CWFgYPDw84O7ujp49e2Lnzp3m+Tt37kRwcDAURcGSJUvM00UE8+fPx5133gk3NzcYjUZMnjzZou3y3vv+++/DYDDA09MTaWlpmDhxIpo2bYrk5GSUlJRg5syZCA4Ohru7OyIiIsp8xLZq1Sp06dIFer0eHh4eCAkJwTvvvIMJEyZg4sSJOH36NBRFQcuWLSutfeHChWjTpg3c3Nzg4+ODJ554AsePHzcvs2zZMnh4eMBgMGDjxo3o06cPvLy8EBgYiDVr1ljUtGPHDrRt2xZGoxF6vR7h4eH45ptvKt3uH3zwAUQE/fv3r3If/fDDD7j77rthMBjg5eWF8PBw5OTklNvnRYsWwcPDA05OTujcuTP8/f2h0+ng4eGBTp06oWfPnggKCoJer4e3tzdee+21Muvz8fHBAw88gEWLFpk/Fvz666/h5eWFOXPmVFlvRZKSkgAADzzwgHkajy/1j6/KdO3aFXl5eejVqxf++9//1qitivTq1Qtt2rTBf/7znzIflf33v/9Ffn4+Hn744QrfX97xWp9Zc24AqPR4r87xV9H4U9U6bDF9+nS8+OKL8PPzK3e+PcahG4YPHw4A+Oqrr8zTtLjNSGPkFvHx8VLO5Er17t1bWrRoIWfPnhWTySRHjhyRe+65R/R6vZw4ccK83IULFwSALF682Dxt+vTpoiiK/OUvf5HMzEzJz8+XpUuXCgA5ePBgle8FIOPHj5fFixfLwIED5dixYzJp0iRxc3OTdevWSWZmpkybNk2cnJxk3759IiISFxcnAGTu3LmSnp4uGRkZ8vHHH8vQoUNFRCQqKkpCQ0Mt+lje+mfOnCmurq6yatUqycrKkqSkJOnUqZM0bNhQUlNTy9S5bds2yc7OlrS0NOnZs6d4eHhIUVGRebnExESZNWuWZGRkSHp6unTr1k3uuOMO8/yTJ08KAPnwww/N01q0aCFt27Yts09uXfbq1avi5eUl8+bNk4KCAklNTZWBAwfKlStXKuzzm2++KQBk7969kpeXJ7///rs8+uijAkC2bNkiV65ckby8PBk3bpwAkEOHDpWpY+rUqRb7cvPmzeLp6Slvv/12mWVvFRoaKkaj0fzv/Px8+eqrr6RZs2by8MMPy9WrVy2W5/FVe8dXefunMvn5+dKlSxcBIACkbdu2Mm/ePElPT6/0fSkpKQJAHn/88UqXCw0NlbNnz8pf//pXASATJkywmD9gwABZuXKl5ObmCgDp3bt3ue3cerxaC4DEx8dX6z32ZMu4be25UdXxbs3xV9X4U9U6qmPnzp3Sv39/ERG5cuWKAJDp06dbLFOTcehWOTk5AkCCgoLM07S0zaKjoyU6Orpa7yH12S28dejQwWJaUlKSAJBJkyaZp936Byo/P18MBoP86U9/snjvmjVrqvXHtaCgwDytoKBADAaDxMbGmqfl5/8/e3ceH1V973/8PVknIRtLECRhR6jsCBYhWPihtUhFMImACAVLG8AWrBRxu1yKUEtRuVSxFqXcC/RCAnqRWsBWq9gqIMgSQFktIFIIAiGBhKyf3x99MDVmIQlJTk7yej4e8wdnvt/v+czJd2benG2yLDg42KZMmWK5ubkWFRVlgwYNKrLO/Px8+6//+i8zK9+Xa1ZWloWFhRVZj5nZxx9/bJKKfCiUVOfVD8kjR44U255X/fKXvzRJlpaWZmYlBzKPx2P33HNPsb7fbLtv3z6TZG+99VaJ6yorvGVmZvqW/c///I9Jsr179xZ7zatXry427u9//3uTZMuXLy/1dZamXbt2vi/7rz+6du1q//M//2M5OTlF2jO/am5+mVUsvJmZ5ebm2qJFi6xTp06+v2XTpk3t/fffL7VPRcNbenq6NWjQwBo2bGhZWVlmZnb06FGLiYmxnJyca4a3ys5Xt4W38r43rjXfzco3/8r6/CnPOiryunr37m0nT540s9LDW0WUZ557PB
6LiooyM/dtM8KbO1Xbfd66du2qyMhI3yGukhw5ckRZWVkaPHhwla334MGDysrKUpcuXXzLQkJC1KxZMx04cECpqalKT0/XXXfdVaSfv7+/pk2bVu717N+/X5cuXVLv3r2LLO/Tp4+CgoK0bdu2MvsHBQVJUqlXxkn/Po+wtFsYpKWlyczKdY5Q27Zt1bRpUz344IOaPXu2jh07ds0+Jbla99dP6r5aZ0mv5WptZ86cqdT6vn61aV5enk6ePKmf/exnmjp1qrp166avvvqq1L7Mr5qbX+URGBioqVOn6rPPPtPWrVs1fPhwpaWlKTExscputxAZGakHHnhAFy5c0OrVqyVJCxcu1JQpU3zbpCzXO1/dorzvjWvN99J8c/6V9flT2XWU5Mknn9SPf/zjUi9MqQ6XL1+WmSkiIkKS+7YZ3Klab9IbGBhY5pfHyZMnJanU8xIq4/Lly5Kkp59+usi9eY4fP66srCzf+QJRUVHXtZ6rJ8uHhYUVey4qKkqZmZkVHvNPf/qTBg4cqOjoaAUHB5d4HtnXXblyRZIUHBx8zbFDQkL017/+VXFxcZo3b57atm2rUaNGKTs7u8J1VsTVKwmv1no9AgIC1KJFC02YMEHPPfecDh48qGeffbbU9syvoqpzflXUt7/9bf3f//2fJk+erLNnz+q9996rsrGvXrjwyiuvKD09XWvWrNGkSZPK1bcq52ttVt73xrXme3mV9flTVev4+9//rr1792rixInl7lMVDh06JEnq1KmTJHdtM7hXtYW3/Px8nT9/Xi1btiy1zdWr13JycqpsvVc/jBYuXOjbY3P1sWXLFt14442SVOYem/K4+uVc0pdoenq6YmJiKjTeiRMnNGLECDVr1kzbtm3TxYsXNX/+/DL7XP2iKe/NRTt37qw//vGPOnXqlGbOnKnk5GQ999xzFaqzonJzcyWpyO0gqkLXrl0lSZ9++mmpbZhf/1YT8+vrPvjgAy1cuND37/j4+BJvwTF27FhJqtIvnB49eqhv3776+OOPlZSUpMTERDVs2LBcfatrvtY25X1vXGu+V0Rpnz9VtY6lS5fq3XfflZ+fny/MXB173rx58ng81XI/tk2bNkmShgwZIsld2wzuVW3h7b333lNhYaF69epVapsuXbrIz89PmzdvrrL1Xr0CsrQ7hLdu3VqNGjXSn//85+taT5cuXRQWFlbsw2Dbtm3Kzc0t8T5UZdm7d6/y8vI0ZcoUtW3bVl6v95qX7V+9I3557i906tQpX9CJjo7Ws88+q169epUZfqrC1dpuuOGGKh33k08+kSR17Nix1DbMr3+r7vn1TZ988okaNGjg+3dOTk6Jc+3qVaHdunWr8DrKcnXv29q1a/Wzn/2s3P2qa77WNuV9b1xrvpdXWZ8/VbWOZcuWFQsyZ8+elfSvq0/NrNhpCNfr9OnTWrhwoWJiYvTQQw9Jctc2g3tVWXjLzc3VxYsXlZ+fr507d2rq1Klq1aqV7zLqkkRHRys+Pl5r167V0qVLlZGRodTUVC1ZsqTSdXi9Xk2YMEGrVq3Syy+/rIyMDBUUFOjkyZP65z//qeDgYD355JP64IMPNHXqVH355ZcqLCxUZmam743SqFEjnTp1SseOHVNmZmaJh369Xq+mT5+uN954QytXrlRGRob27t2ryZMnq3nz5kpKSqpQ3Vf3UL7zzju6cuWKDh8+fM3zmkJDQ9W2bVvfIZCynDp1SpMmTdKBAweUm5urXbt26fjx4+rbt2+5X3NlXK3t6p6yjRs3VvgS/ezsbBUWFsrMdOrUKS1btkxPP/20mjRpUuYXM/Pr36p7fl2Vl5enM2fO6P333y8S3iRpxIgRSklJUXp6ui5evKg333xTjz/+uO69994qD2/333+/mjRpohEjRqht27bl7vfN+VpXlfe9ca35Xl5lff5U1ToqoqKfQ2amS5cu+T6Hrv7aS//+/eXv769169b5znmrq9sMtcw3r2CozNWmy5Yts0GDBlnTpk
0tICDAGjdubKNHj7bjx4/72rz44ovWrFkzk2ShoaG+S7kzMzNt4sSJ1rhxYwsLC7O4uDibNWuWSbKYmBjbs2dPiX3nz59vISEhvku0V6xY4VtXTk6OzZw501q2bGkBAQEWHR1t8fHxtn//fl+bl156ybp27Wper9e8Xq/17NnTFi9ebGZmO3futFatWllISIjFxcXZ008/XWLthYWFtmDBAuvQoYMFBgZaw4YNbcSIEXbw4EHfehYvXmyhoaEmyTp06GBHjx61JUuWWEREhEmyVq1a+W6nMnPmTGvUqJFFRUVZYmKivfTSSybJ2rVrZ4888ojdcMMNJskaNGhg9913n5mZTZ061QIDA31X1pmZPf/888XaHjt2zPr162cNGzY0f39/u/HGG+2pp56y/Pz8El/zE0884au7devW9re//c1+9atfWWRkpEmyG264wf7whz/Y6tWrfetq2LChrVq1qsjcGDp0qLVo0cIKCwvNzGzDhg0WHh5uc+fOLXU+vfHGG6VeaRocHGwdOnSwKVOm2IkTJ5hfDsyvsv4+X3+88cYbvj5//vOfbeTIkdauXTsLDg62oKAg69ixo82ePduuXLlSbA5kZGTY7bffbo0aNTJJ5ufnZ+3bt7d58+aVOleaNGliP/nJT3zPPfbYY/bRRx/5/v317ezn52c333yz/e1vfysy3jfna3nJZVebmpXvvWFW9nwv7/y71udPed5TlVHa1abl+Rxav369devWzUJDQy0oKMj8/PxMku/K0ltvvdXmzJlT4u1u3LTNuNrUnaokvME5hw8ftoCAgCLhorb46quvzOv12nPPPed0Kaik2jy/qtr1zFc3hjfAjPDmVtV6tSmqX/v27TVnzhzNmTNHly5dcrqcImbPnq0ePXpo6tSpTpeCSqrN86uqMV8BuAXhrQ544oknlJiYqFGjRtWaH0d+4YUXtHv3bm3YsKFKf/cWNa82zq+qxnytnQ4cOFDkVhilPUaNGuV0qUCNIrzVEfPmzdPUqVPLvO9ZTXnzzTeVk5Oj999/v9y3aEDtVpvmV1VjvtZenTp1KnYFaUmPqzdkBuqLAKcLQNX57ne/W+YPb9eUe++9V/fee6/TZaCK1Zb5VdWYrwDchj1vAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAiAaU9kZiYWJN1AK6Qnp6usLAwBQSU+tYBHFObPrfPnj2r6Ohop8vANWzdulV9+/Z1ugxUULE9b7GxsUpISHCiFqBWMzNt27ZNf/3rX5WRkeF0OahFEhISFBsb69j6a9Pndm5urrZu3arNmzfrwoULTpeDa+jbt69uu+02p8tABXnMzJwuAnCLkydPavTo0dqxY4d+9atfadq0aU6XBNQaf/nLXzRhwgQVFBRo6dKluvvuu50uCaiTOOcNqICYmBi99957mjlzph599FElJCTo4sWLTpcFOCo7O1uPP/64vve976lfv37av38/wQ2oRux5Ayrp3Xff1YMPPqiIiAglJyerR48eTpcE1Ljt27dr7NixOn36tF588UWNHTvW6ZKAOo89b0AlDR48WHv27FGrVq3Ut29fLVq0yOmSgBqTn5+v+fPnKy4uTrGxsdq3bx/BDagh7HkDrlNBQYGeeeYZzZ07V8OGDdPSpUvVsGFDp8sCqs0//vEPjRs3Tjt27NDs2bM1Y8YM+fmxLwCoKYQ3oIq89957GjNmjIKCgrR69Wouv0edtHz5cj388MNq06aNVq5cqW7dujldElDv8F8loIoMGjRIe/bs0be+9S3dfvvtmj9/vvi/EeqKM2fOaNiwYXrooYf08MMPa8eOHQQ3wCHseQOqmJnpN7/5jWbMmKG7775bv//979WoUSOnywIq7Y033lBSUpLCwsK0fPlyDRgwwOmSgHqNPW9AFfN4PJo2bZreeecd7dixQz
179tRHH33kdFlAhWVkZCgpKUnx8fEaMmSI9u7dS3ADagHCG1BNbr/9du3evVtdunTRd77zHc2ePVuFhYVOlwWUy5YtW9SrVy+tW7dO69at0/LlyxUWFuZ0WQBEeAOqVZMmTfTWW2/pueee0y9/+Uvde++9OnfunNNlAaXKy8vT7NmzNWDAAN10003avXu37r33XqfLAvA1nPMG1JCPP/5Yo0aNUl5enlatWqW4uDinSwKK2L9/v8aOHasjR47oueee049//GOnSwJQAva8ATXk1ltv1a5du9S3b18NGjSIw6ioNcxMixYt0i233KLg4GDt3LmT4AbUYux5A2rY1atRH3vsMQ0YMEArV65Us2bNnC4L9dSJEyf0gx/8QB9++KGefPJJ/cd//If8/f2dLgtAGdjzBtSwq1ejfvjhhzp27Ji6d++uv/zlL06XhXpozZo16tGjh9LS0rR161bNnj2b4Aa4AOENcEjv3r21c+dODRw4UEOGDNHs2bNVUFDgdFmoB9LT0zVmzBiNHDlSiYmJ2r59u3r16uV0WQDKicOmQC2wZMkSTZ06Vf369dPKlSt14403Ol0S6qg///nPeuihh+Tv76///u//1qBBg5wuCUAFsecNqAV+/OMf66OPPtIXX3yhHj16aNOmTU6XhDomOztb06ZN0/e+9z3169dPu3btIrgBLkV4A2qJXr16aefOnbrjjjt09913a9q0acrLy3O6LNQBH3/8sXr27Knly5dr+fLlSklJ4SfbABcjvAG1SHh4uP73f/9X//3f/63XXntNd9xxh7788kuny4JL5efna/78+YqLi1NsbKz27t2rBx980OmyAFwnwhtQC40bN07bt2/XuXPn1KNHD23YsMHpkuAyn3/+uQYOHKhf/OIXeuaZZ/T2228rJibG6bIAVAHCG1BL3Xzzzdq6davuuusuff/73+cwKsrFzLRkyRJ169ZNubm52rlzp2bOnCk/Pz7ugbqCq00BF1i+fLmmTJmizp07a/Xq1WrTpo3TJaEWOnPmjCZOnKhNmzZp+vTpmjNnjoKCgpwuC0AV479igAuMGzdOO3bsUHZ2tnr27Km1a9c6XRJqmddff12dO3fW/v379d577+lXv/oVwQ2oowhvgEt06tRJ27Zt0w9+8APdf//9mjZtmnJzc50uCw7LyMhQUlKSEhISdPfddys1NVVxcXFOlwWgGnHYFHChFStWaMqUKerUqZNWr16tdu3aOV0SHPDRRx9p3LhxyszM1Kuvvqphw4Y5XRKAGsCeN8CFxo4dqx07digvL0+9evVSSkqK0yWhBuXk5Ojxxx/XgAED1L17d+3fv5/gBtQjhDfApTp27KitW7dq/PjxGjlypJKSkpSTk+N0Wahm+/btU9++ffXb3/5Wv/3tb/X666+rSZMmTpcFoAYR3gAX83q9WrRokV5//XWlpKSoX79+OnLkiNNloRqYmRYtWqTevXsrJCREn3zyiX784x87XRYABxDegDrgvvvu065duxQYGKhevXpp1apVTpeEKnT8+HENGjRIM2bM0OOPP66//e1vat++vdNlAXAI4Q2oI1q3bq3NmzdrwoQJeuCBBzRu3DhlZWU5XRau05o1a9SzZ0999dVX2rZtm2bPni1/f3+nywLgIMIbUIcEBwdr0aJF+r//+z+99dZb6tOnj/bv3+90WaiEs2fP6r777tPIkSM1duxYffLJJ+rZs6fTZQGoBQhvQB00fPhw7d69W5GRkerbt6/+8Ic/OF0SKuDtt99Wjx499Mknn+jdd9/VokWLFBwc7HRZAGoJwhtQR7Vs2VIffPCBHn74YY0dO1bjxo3T5cuXnS4LZcjOzta0adM0ZMgQ9e/fX7t379agQYOcLgtALcNNeoF6YP369ZowYYKaNm2qlJQUde3a1emS8A0ff/yxxo4dq7S0NL300ksaM2aM0yUBqKXY8wbUA8OGDdPu3bvVuHFjffvb39aiRYtKbWtm+tOf/lSD1dVtp0
6dKvP5/Px8zZ8/X3FxcWrVqpX27dtHcANQJsIbUE/Exsbq/fff12OPPaZHH31U48aN06VLl4q1W7hwoYYPH65PPvnEgSrrlrNnz6pPnz76y1/+UuLzBw4c0G233aZf/OIXWrBggd5++221aNGihqsE4DYcNgXqoXfeeUcPPvigoqKilJycrO7du0uStm3bpv79+6uwsFA33XST9uzZw4nylWRmGjp0qDZu3KimTZvqs88+U6NGjXzPvfrqq3r00Ud18803a8WKFerYsaPDFQNwC/a8AfXQHXfcoR07dqhp06bq27evFi1apPT0dCUkJMjj8cjMdOTIEf3nf/6n06W61osvvqi3335bknThwgU99NBDkqTTp0/rnnvu0cMPP6yf/OQn+vDDDwluACqEPW9APZafn6///M//1K9+9Su1adNGx48fV35+vu95j8ejDz74QHFxcQ5W6T779u3TLbfcotzc3CLLp02bppUrVyoyMlLLly9X//79HaoQgJsR3gBoypQpeuWVV/TNjwN/f3/fSfQhISEOVecuV65cUc+ePXXkyJFiQdjf31/333+/fve73yksLMzBKgG4GYdNgXouNTVVr732WrHgJkkFBQU6ceKEnnzySQcqc6ef/vSnxYKb9K/z3Dwej44dO6bQ0FCHqgNQF7DnDajHLl26pO7du+vEiRPFwsbXeTwevffee/rOd75Tg9W5zxtvvKH4+Pgy2/j5+en555/XI488UkNVAahrCG9APZaYmKi1a9des52/v79atGihTz/9VA0aNKiBytznxIkT6tKliy5dulTiXsyvCwwM1K5du9S5c+caqg5AXcJhU6CeOn/+vBo3bqzo6GhJKvOWIAUFBTp16pQee+yxmirPVQoKCjRy5EhduXLlmsEtKChIeXl5mjx58jXbAkBJ2PMGQPv379eaNWv0v//7vzp8+LACAwOVn59fLFx4PB5t3LhRd911l0OV1k6zZ8/WM888o8LCwmLP+fv7y8xUWFioli1bavjw4brnnns0YMAA7qEHoFIIbwCK+Oyzz7Ru3TqtXbtWu3btkp+fny98+Pn5qVmzZvrss88UERHhdKm1wt///nd95zvf8QU3j8ejwMBA5ebmKjIyUnfccYe++93v6u6771ZMTIzD1QKoCwhvqJVOnjypjz76yOky6r3z589rx44d2rZtmz799FNfQBk0aJAmTZrkcHXOu3z5sqZPn64LFy5I+tfFCO3bt9ctt9yi7t27q3Xr1vJ4PA5XWf/Exsbqtttuc7oMoNoQ3lArpaSkaOTIkU6XAcCFEhIStGbNGqfLAKpNgNMFAGXh/xa105UrV/TZZ5+pZ8+eTpfimEuXLunMmTNq166d06XgaxITE50uAah2hDcAFeb1eut1cJOksLAwfiUBgCO4VQgAAICLEN4AAABchPAGAADgIoQ3AAAAFyG8AQAAuAjhDQAAwEUIbwAAAC5CeAMAAHARwhsAAICLEN4AAABchPAGAADgIoQ3AAAAFyG8AQAAuAjhDfXes88+q8jISHk8Hu3evdvpcsptwoQJ8nq98ng8unLlSp2po0+fPvL391ePHj0qPcaGDRsUGRmpP/7xj6W2mThxosLDw6/7737w4EH99Kc/VefOnRUeHq6AgABFRkbqpptu0tChQ7Vly5ZKjw0AJSG8od574okn9Lvf/c7pMips2bJl+vnPf+50GVVex/bt2zVo0KDrGsPMrtnmtdde06uvvnpd61m6dKm6du2q1NRUvfDCC/riiy90+fJl7dq1S88884zS09O1d+/e61oHAHxTgNMFAFUlOztbgwcP1kcffeR0KagCHo+n0n2HDh2qixcvVmE1xW3dulVJSUn6zne+o7ffflsBAf/+OG3btq3atm2rqKgoHT58uFrruB5Ovmd4vwKVR3hDnbF06VKlpaU5XYYjrifoVKWqrCMwMLDKxirN9dQ7d+5cFRQU6Nlnny0S3L7urrvu0l133VXpdVQ3J98z9f
n9ClwvDpuiTnjkkUc0ffp0HT16VB6PR+3bt5f0r8NnL7zwgr71rW8pODhYDRs21PDhw3XgwIEyxztz5oxat26tgIAAfe973/MtLygo0KxZs9SyZUuFhISoW7duSk5OliS9/PLLatCggUJDQ/Xmm29qyJAhioiIUExMjFatWlXp17ZixQr17t1bXq9XDRo0UOvWrfXMM8/4nvfz89Of/vQnDRkyRJGRkWrevLl+//vfFxnjb3/7m26++WZFRkbK6/Wqa9euevvttyVJv/71rxUaGqrw8HClpaVp+vTpatGihQ4ePFihOq9Vx8SJE+XxeOTxeNSuXTvt2rVL0r/OmQsNDVVkZKTWr1/va3/kyBF16tRJDRo0UEhIiAYMGKC///3vvudLq3vp0qVq2bKlPB6PXnrpJV97M9OCBQvUsWNHBQcHKzIyUjNmzCj2OjZt2qSIiAjNmzev1Neam5urd999V40bN9att95a7m1UnvlY0XlU1vwo6+9e2numquZ4Va8bwNcYUAslJydbRadnfHy8tWvXrsiyWbNmWVBQkK1YscLS09MtNTXVevXqZU2aNLHTp0/72q1atcok2a5du8zMLDc31+Lj4+3NN98sMt7Pf/5zCw4OtrVr19qFCxfsySefND8/P9u+fbuZmT311FMmyd599127ePGipaWl2YABA6xBgwaWm5tb4e2wcOFCk2TPPvusnTt3zs6fP2+/+93vbMyYMcXWl56ebufPn7e7777bgoOD7fLly75x1qxZY7Nnz7bz58/buXPnrG/fvta4cWPf81fHmTZtmr344ot233332WeffVbuOstbR3x8vPn7+9uXX35ZpP8DDzxg69ev9/178ODB1rZtW/vHP/5heXl5tm/fPvv2t79tXq/XDh06dM26v/gK3whSAAAgAElEQVTiC5NkL774YpG2Ho/Hnn/+ebtw4YJlZWXZ4sWLi/zdzczeeustCw8Ptzlz5pT6eg8dOmSSrG/fvuXeRmbln4/lnUfXmh/X+ruX9J6pqjleHesuj4SEBEtISCh3e8CNCG+olaoivGVlZVlYWJiNGjWqSLuPP/7YJBX5cv56eMvLy7PRo0fbxo0bi/TLzs620NDQIuNlZWVZcHCwTZkyxcz+/cWWnZ3ta3M1IBw5cqRCryc3N9eioqJs0KBBRZbn5+fbf/3Xf5W6vuXLl5sk27dvX6lj//KXvzRJlpaWVuo4FVHeOt555x2TZHPnzvUtu3jxonXo0MHy8/N9ywYPHmzdu3cvso7U1FSTZD//+c/LXK+ZFQtvWVlZFhoaanfeeWeRdt8M7eW1Y8cOk2R33HFHuftUZD6WZx6VZ3580zf/7t98z1TnHK+KdZcH4Q31AYdNUWft379fly5dUu/evYss79Onj4KCgrRt27ZifQoKCvTAAw+oadOmRQ6XSv+6JURWVpa6dOniWxYSEqJmzZqVeRg2KChIkpSXl1eh+lNTU5Wenl7snCl/f39Nmzat1H5XzxUra31X2xQUFFSopoooqY7/9//+n2666Sb9/ve/910Runr1ao0aNUr+/v5ljte1a1dFRkYqNTW1wrUcOXJEWVlZGjx4cIX7liQsLEySlJWVVe4+lZmPX/fNeVSZ+XGtv3t1zvHqWjdQHxHeUGelp6dL+vcX7ddFRUUpMzOz2PKf/OQnOnz4sF555RV9+umnRZ67fPmyJOnpp5/2nbvl8Xh0/PjxCn2Jl1dGRoav1uv1pz/9SQMHDlR0dLSCg4P12GOPXfeYleHxeDRp0iR9/vnnevfddyVJy5cv1w9/+MNy9Q8MDKxwCJakkydPSpKio6Mr3LckrVu3ltfr1aFDh8rdpzLzsSzlmR8V/btX5Rx3ct1AXUd4Q5119UutpC/F9PR0xcTEFFt+//336y9/+YuioqI0btw45efn+567+sW/cOFC2b9OOfA9quNGrDfeeKMk6auvvrqucU6cOKERI0aoWbNm2rZtmy5evKj58+dXRYmVMn78eH
m9Xr322ms6ePCgIiIi1KpVq2v2y8/P1/nz59WyZcsKr9Pr9UqScnJyKty3JMHBwbrrrrv01Vdf6cMPPyy13fnz5zVx4kRJlZuPZbnW/KjM372q5riT6wbqA8Ib6qwuXbooLCxMO3bsKLJ827Ztys3N1S233FKsz6BBg9SkSRMtWbJEn3zyiebOnet7LjY2Vl6vt8Z+haF169Zq1KiR/vznP1/XOHv37lVeXp6mTJmitm3b+n4NwSkNGzbUyJEjtW7dOj333HP60Y9+VK5+7733ngoLC9WrV68Kr7NLly7y8/PT5s2bK9y3NLNnz1ZwcLAeffRRZWdnl9hm3759vtuIVGY+luVa86Myf/eqmuNOrhuoDwhvqDMaNWqkU6dO6dixY8rMzJS/v7+mT5+uN954QytXrlRGRob27t2ryZMnq3nz5kpKSip1rGHDhmn8+PGaN2+ePvnkE0n/2nszYcIErVq1Si+//LIyMjJUUFCgkydP6p///GeVv57g4GA9+eST+uCDDzR16lR9+eWXKiwsVGZmZrFDumW5uqfqnXfe0ZUrV3T48OFrnl9V3SZPnqycnBy99dZbuueee0psk5ubq4sXLyo/P187d+7U1KlT1apVK40fP77C64uOjlZ8fLzWrl2rpUuXKiMjQ6mpqVqyZEmxths3brzmrUIkqUePHvrDH/6gffv2acCAAdqwYYMuXryovLw8/eMf/9Crr76qH/7wh75zvbxeb6XnY0muNT/K83cv6T1TFXPcyXUD9YIjl0kA11CZq0137txprVq1spCQEIuLi7PTp09bYWGhLViwwDp06GCBgYHWsGFDGzFihB08eNDX7/XXX7eGDRuaJGvdurWlpaVZRkaGxcbGmiQLCwuz5cuXm5lZTk6OzZw501q2bGkBAQEWHR1t8fHxtn//flu8eLGFhoaaJOvQoYMdPXrUlixZYhERESbJWrVqVeQ2F+X10ksvWdeuXc3r9ZrX67WePXva4sWLbf78+RYSElJkfStXrvS9lpiYGN+VnjNnzrRGjRpZVFSUJSYm2ksvvWSSrF27dvaTn/zEN05sbKytWLGiQvVVpI6v69mzpz3xxBMljrls2TIbNGiQNW3a1AICAqxx48Y2evRoO378eInr/XrdL774ojVr1swkWWhoqA0bNszMzDIzM23ixInWuHFjCwsLs7i4OJs1a5avxj179piZ2YYNGyw8PLzIFbFlOXHihP385z+3rl27WlhYmPn7+1tUVJT17NnTfvjDH9qHH37oa1ue+VjReVTa/DAr++9+4sSJEt8zVTXHq3rd5cXVpqgPPGbl+BFAoIalpKRo5MiR5fqNSrjT0KFD9dJLL6lNmzZOl4I6JDExUZK0Zs0ahysBqg+HTQHUiK9fJZqamiqv10twA4BKILwBNejAgQNFboNQ2mPUqFF1rs6ZM2fq8OHDOnTokCZMmFDkJ74AAOXHD9MDNahTp06uOBRcHXWGhoaqU6dOatGihRYvXqybb765SscHgPqCPW8AasTcuXNVUFCgEydOlHqFKQDg2ghvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAiAU4XAJQlJSXF6RIAuMjJkycVExPjdBlAtSK8oVYbOXKk0yUAcJmEhASnSwCqlcfMzOkiAKAiPB6PkpOTdf/99ztdCgDUOM55AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAA
C4COENAADARQhvAAAALkJ4AwAAcJEApwsAgLIsWbJEFy5cKLb8zTff1D/+8Y8iy8aPH68bbrihpkoDAEd4zMycLgIASpOUlKQlS5YoODjYt8zM5PF4fP/Oz89XZGSkTp8+rcDAQCfKBIAaw2FTALXa6NGjJUk5OTm+R25ubpF/+/n5afTo0QQ3APUCe94A1GqFhYVq3ry50tLSymz397//Xf3796+hqgDAOex5A1Cr+fn56cEHH1RQUFCpbZo3b65+/frVYFUA4BzCG4Bab/To0crNzS3xucDAQI0bN67IOXAAUJdx2BSAK7Rt27bY1aVX7d69W927d6/higDAGex5A+AK48aNK/GChLZt2xLcANQrhDcArvDggw8qLy+vyLLAwEBNmDDBoYoAwBkcNgXgGt26ddO+ffv09Y+tQ4cOqUOHDg5WBQA1iz1vAFxj3Lhx8vf3lyR5PB717NmT4Aag3iG8AXCNBx54QAUFBZIkf39//eAHP3C4IgCoeYQ3AK5x4403ql+/fvJ4PCosLFRiYqLTJQFAjSO8AXCVsWPHysx0++2368Ybb3S6HACocVywADggJSVFI0eOdLoM1FMJCQlas2aN02UAqKQApwsA6rPk5GSnS3Cl559/XklJSQoLC3O6FNdZuHCh0yUAuE6EN8BB999/v9MluFK/fv0UExPjdBmuxB43wP045w2A6xDcANRnhDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCG+BSEydOVHh4uDwej3bv3u10OY55/fXX1bZtW3k8niKPoKAgNW3aVAMHDtSCBQt04cIFp0sFgCpBeANc6rXXXtOrr77qdBmOi4+P1+eff6527dopMjJSZqbCwkKlpaUpJSVFbdq00cyZM9W5c2ft2LHD6XIB4LoR3gDUCtnZ2erXr1+VjOXxeBQVFaWBAwdq2bJlSklJ0ZkzZzR06FBdvHixStbhpKrcVgDch/AGuJjH43G6hCqzdOlSpaWlVcvYCQkJGj9+vNLS0vTKK69UyzpqUnVuKwC1H+ENcAkz04IFC9SxY0cFBwcrMjJSM2bMKNLm17/+tUJDQxUeHq60tDRNnz5dLVq00MGDB2VmeuGFF/Stb31LwcHBatiwoYYPH64DBw74+v/mN7+R1+tV06ZNNWnSJDVv3lxer1f9+vXTtm3bitVzrfGmTp2qoKAgNWvWzLfs4YcfVoMGDeTxePTVV19Jkh555BFNnz5dR48elcfjUfv27SVJmzZtUkREhObNm3fd22/8+PGSpI0bN9bJbQWgHjEANS45Odkq+vZ76qmnzOPx2PPPP28XLlywrKwsW7x4sUmyXbt2FWknyaZNm2Yvvvii3XffffbZZ5/ZrFmzLCgoyFasWGHp6emWmppqvXr1siZNmtjp06d9/ZOSkqxBgwb26aef2pUrV2z//v3Wp08fCw8PtxMnTvjalXe8MWPG2A033FDktSxYsMAk2dmzZ33L4uPjrV27dkXavfXWWxYeHm5z5sy55vZp166dRUZGlvp8RkaGSbLY2Ng6ua3KKyEhwRISEirVF0DtQHgDHFDR8JaVlWWhoaF25513Flm+atWqUsNbdnZ2kf5hYWE2atSoIv0//vhjk1QkHCUlJRULQdu3bzdJ9otf/KLC49VEIDG7dngzM/N4PBYVFeX7d33cVoQ3wP04bAq4wJEjR5SVlaXBgwdXqv/+/ft16dIl9e7du8jyPn36KCgoqNhhvm/q3bu3QkNDfYf5rnc8J1y+fFlmpoiIiDLbsa0A1HaEN8AFTp48KUmKjo6uVP/09HRJUlhYWLHnoqKilJmZec0xgoODdfbs2Sobr6YdOnRIktSpU6cy27GtANR2hDfABbxeryQpJyenUv2joqIkqcSgkJ6erpiYmDL75+XlFWl3veM5YdOmTZKkIUOGlN
mObQWgtiO8AS7QpUsX+fn5afPmzZXuHxYWVuwmtdu2bVNubq5uueWWMvu///77MjP17du3wuMFBAQoLy+vUnVXldOnT2vhwoWKiYnRQw89VGbb+r6tANR+hDfABaKjoxUfH6+1a9dq6dKlysjIUGpqqpYsWVKu/l6vV9OnT9cbb7yhlStXKiMjQ3v37tXkyZPVvHlzJSUlFWlfWFioCxcuKD8/X6mpqXrkkUfUsmVL3+02KjJe+/btdf78ea1bt055eXk6e/asjh8/XqzGRo0a6dSpUzp27JgyMzOVl5enjRs3VuhWIWamS5cuqbCwUGams2fPKjk5Wf3795e/v7/WrVt3zXPe3LqtANQjjl4uAdRTlblVSGZmpk2cONEaN25sYWFhFhcXZ7NmzTJJFhMTY3v27LH58+dbSEiI75YYK1as8PUvLCy0BQsWWIcOHSwwMNAaNmxoI0aMsIMHDxZZT1JSkgUGBlqLFi0sICDAIiIibPjw4Xb06NEi7co73rlz52zQoEHm9XqtTZs29tOf/tRmzJhhkqx9+/a+W2rs3LnTWrVqZSEhIRYXF2enT5+2DRs2WHh4uM2dO7fU7bJ+/Xrr1q2bhYaGWlBQkPn5+Zkk35Wlt956q82ZM8fOnTtXpF9d21blxdWmgPt5zMwczI5AvZSSkqKRI0eqNr79Jk2apDVr1ujcuXNOl1LruXFbJSYmSpLWrFnjcCUAKovDpgCKKSgocLoE12BbAahphDcAAAAXIbwB8HnyySe1bNkyXbx4UW3atNHatWudLqnWYlsBcArnvAEOqM3nvKFu45w3wP3Y8wYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8AYAAOAihDcAAAAXIbwBAAC4COENAADARQhvAAAALhLgdAFAfebxeJwuAfVQQkKC0yUAuA4eMzOniwDqm5MnT+qjjz5yugzXGjlypB555BHddtttTpfiSrGxsWw7wMUIbwBcx+PxKDk5Wffff7/TpQBAjeOcNwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CIBThcAAGU5fvy4CgoKii0/c+aMPv/88yLLmjdvrpCQkJoqDQAc4TEzc7oIACjNkCFDtGnTpmu2CwgI0OnTp9W4ceMaqAoAnMNhUwC12qhRo+TxeMps4+fnpzvvvJPgBqBeILwBqNXuu+8+BQYGXrPd2LFja6AaAHAe4Q1ArRYeHq7vf//7ZQa4wMBA3XPPPTVYFQA4h/AGoNYbM2aM8vPzS3wuICBAI0aMUFhYWA1XBQDOILwBqPWGDh2qBg0alPhcQUGBxowZU8MVAYBzCG8Aar3g4GAlJCQoKCio2HNhYWH67ne/60BVAOAMwhsAV3jggQeUm5tbZFlgYKBGjRpVYqgDgLqK+7wBcIXCwkLdcMMN+uqrr4osf++99zRw4EBnigIAB7DnDYAr+Pn56YEHHiiyly06OloDBgxwsCoAqHmENwCuMXr0aN+h06CgII0bN07+/v4OVwUANYvDpgBcw8zUqlUrffHFF5Kk7du3q3fv3g5XBQA1iz1vAFzD4/Fo3LhxkqRWrVoR3ADUSwFOFwCguC1btuiFF15wuoxaKSMjQ5LUoEEDJSYmOlxN7XTbbbfp0UcfdboMANWEPW9ALfTFF19o7dq1TpdRK0VERCgyMlIxMTFOl1Irbd26VVu2bHG6DADViD1vQC22Zs0ap0uold5++23dddddTpdRK7E3Eq
j72PMGwHUIbgDqM8IbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q2ooyZOnKjw8HB5PB7t3r3b6XIqZe7cufJ4PMUeXbp0qfBYr7/+utq2bVtsrKCgIDVt2lQDBw7UggULdOHChWp4JQBQdQhvQB312muv6dVXX3W6jFojPj5en3/+udq1a6fIyEiZmQoLC5WWlqaUlBS1adNGM2fOVOfOnbVjxw6nywWAUhHeANRqK1askJkVeezbt69KxvZ4PIqKitLAgQO1bNkypaSk6MyZMxo6dKguXrxYJesAgKpGeAPqMI/H43QJrpKQkKDx48crLS1Nr7zyitPlAECJCG9AHWFmWrBggTp27Kjg4GBFRkZqxowZxdoVFBRo1qxZatmypUJCQtStWzclJydLkl5++WU1aNBAoaGhevPNNzVkyBBFREQoJiZGq1atKjLO5s2bdeuttyo0NFQRERHq2rWrMjIyrrmO6rBp0yZFRERo3rx51z3W+PHjJUkbN270LauL2wyAexHegDriP/7jPzRz5kwlJSXpzJkzOn36tB5//PFi7R5//HH9+te/1sKFC/XPf/5T99xzjx544AHt2LFDU6ZM0c9+9jNlZ2crPDxcycnJOnr0qNq2basf/ehHysvLkyRdvnxZw4YNU0JCgs6fP6/Dhw/rpptuUm5u7jXXUVFPPPGEGjZsqKCgILVp00bDhw/X9u3bi7QpKCiQJBUWFlZ4/G/q0aOHJOnzzz/3LXPbNgNQxxmAWic5Odkq8vbMysqy0NBQu/POO4ssX7VqlUmyXbt2mZlZdna2hYaG2qhRo4r0DQ4OtilTppiZ2VNPPWWSLDs729dm8eLFJsmOHDliZmb79u0zSfbWW28Vq6U86yivEydO2M6dOy0zM9NycnJsy5Yt1rNnTwsJCbF9+/ZVaKyr2rVrZ5GRkWW28Xg8FhUVZWbu22YJCQmWkJBQoT4A3IU9b0AdcOTIEWVlZWnw4MFltjt48KCysrKK3GojJCREzZo104EDB0rtFxQUJEm+vUht27ZV06ZN9eCDD2r27Nk6duzYda+jJLGxserZs6fCwsIUFBSkvn37atmyZcrOztbixYsrNFZ5Xb58WWamiIgISe7bZgDqPsIbUAecPHlSkhQdHV1mu8uXL0uSnn766SL3Ojt+/LiysrLKvb6QkBD99a9/VVxcnObNm6e2bdtq1KhRys7OrrJ1lKZr167y9/fXoUOHrnusklwdt1OnTpLqxjYDULcQ3oA6wOv1SpJycnLKbHc13C1cuLDY7Te2bNlSoXV27txZf/zjH3Xq1CnNnDlTycnJeu6556p0HSUpLCxUYWGhgoODr3uskmzatEmSNGTIEEl1Y5sBqFsIb0Ad0KVLF/n5+Wnz5s1ltouNjZXX673uX1w4deqUPv30U0n/CjfPPvusevXqpU8//bTK1iFJd911V7Fl27dvl5nptttuu+7xv+n06dNauHChYmJi9NBDD0ly3zYDUPcR3oA6IDo6WvHx8Vq7dq2WLl2qjIwMpaamasmSJUXaeb1eTZgwQatWrdLLL7+sjIwMFRQU6OTJk/rnP/9Z7vWdOnVKkyZN0oEDB5Sbm6tdu3bp+PHj6tu3b5WtQ5K+/PJLrV69Wunp6crLy9OWLVs0ceJEtWzZUpMnT/a127hxY4VuFWJmunTpkgoLC2VmOnv2rJKTk9W/f3/5+/tr3bp1vnPe3LbNANQDNXt9BIDyqOjVpmZmmZmZNnHiRGvcuLGFhYVZXFyczZo1yyRZTEyM7dmzx8zMcnJybObMmdayZUsLCAiw6Ohoi4+Pt/3799vixYstNDTUJFmHDh3s6NGjtmTJEouIiDBJ1qpVKzt06JAdO3bM+vXrZw0bNjR/f3+78cYb7amnnrL8/PxrrqMipk+fbu3atbMGDRpYQECAxc
TE2I9+9CM7depUkXYbNmyw8PBwmzt3bqljrV+/3rp162ahoaEWFBRkfn5+Jsl3Zemtt95qc+bMsXPnzhXr66ZtxtWmQN3nMTNzMDsCKEFKSopGjhwp3p6oqMTEREnSmjVrHK4EQHXhsCkAAICLEN4A1JgDBw4UuRVGaY9Ro0Y5XSoA1FoBThcAoP7o1KkTh4IB4Dqx5w0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CIBThcAoHSJiYlOlwCX2bp1q/r27et0GQCqEXvegFooNjZWCQkJTpdRa61fv16nTp1yuoxaqW/fvrrtttucLgNANfKYmTldBABUhMfjUXJysu6//36nSwGAGseeNwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABchvAEAALgI4Q0AAMBFCG8AAAAuQngDAABwEcIbAACAixDeAAAAXITwBgAA4CKENwAAABfxmJk5XQQAlGbs2LHavXt3kWXHjh1TdHS0GjRo4FsWGBioP/7xj2rRokVNlwgANSrA6QIAoCwdO3bUypUriy2/dOlSkX936tSJ4AagXuCwKYBabfTo0fJ4PGW2CQwM1Pjx42umIABwGIdNAdR6t9xyi3bv3q3CwsISn/d4PPr888/VunXrmi0MABzAnjcAtd64cePk51fyx5XH49Gtt95KcANQbxDeANR6I0eOLHWvm5+fn8aNG1fDFQGAcwhvAGq9Zs2aacCAAfL39y/x+fj4+BquCACcQ3gD4Apjx44ttszPz0+DBg3SDTfc4EBFAOAMwhsAV0hMTCzxvLeSQh0A1GWENwCuEBERoe9973sKCPj37Sn9/f117733OlgVANQ8whsA13jwwQdVUFAgSQoICNCwYcMUGRnpcFUAULMIbwBcY9iwYQoJCZEkFRQUaMyYMQ5XBAA1j/AGwDW8Xq/uu+8+SVJoaKiGDBnicEUAUPP4bVOgCp08eVIfffSR02XUabGxsZKkPn36aP369Q5XU7fFxsbqtttuc7oMAN/Az2MBVSglJUUjR450ugygSiQkJGjNmjVOlwHgG9jzBlQD/k9UvWbPnq2nn366yJWnqFqJiYlOlwCgFJzzBsB1CG4A6jPCGwDXIbgBqM8IbwAAAC5CeAMAAHARwhsAAICLEN4AAABchPAGAADgIoQ3AAAAFyG8AQAAuAjhDQAAwEUIbwAAAC5CeAMAAHARwhsAAICLEN4AAABchPAG1DITJ05UeHi4PB6Pdu/e7XQ5tUJhYaEWLlyofv36VXqM119/XW3btpXH4ynyCAoKUtOmTTVw4EAtWLBAFy5cqMLKAaDqEd6AWua1117Tq6++6nQZtcbhw4d1++2369FHH1VWVlalx4mPj9fnn3+udu3aKTIyUmamwsJCpaWlKSUlRW3atNHMmTPVuXNn7dixowpfAQBULcIbgGqVnZ1d6T1me/bs0eOPP67JkyerR48eVVyZ5PF4FBUVpYEDB2rZsmVKSUnRmTNnNHToUF28eLHK11fTrmfbA6i9CG9ALeTxeJwuocosXbpUaWlplerbvXt3vf766xozZoyCg4OruLLiEhISNH78eKWlpemVV16p9oPeucUAABB3SURBVPVVt+vZ9gBqL8Ib4DAz04IFC9SxY0cFBwcrMjJSM2bMKNLm17/+tUJDQxUeHq60tDRNnz5dLVq00MGDB2VmeuGFF/Stb31LwcHBatiwoYYPH64DBw74+v/mN7+R1+tV06ZNNWnSJDVv3l
xer1f9+vXTtm3bitVzrfGmTp2qoKAgNWvWzLfs4YcfVoMGDeTxePTVV19Jkh555BFNnz5dR48elcfjUfv27atjE2rTpk2KiIjQvHnzrnus8ePHS5I2btwoiW0PoBYyAFUmOTnZKvq2euqpp8zj8djzzz9vFy5csKysLFu8eLFJsl27dhVpJ8mmTZtmL774ot1333322Wef2axZsywoKMhWrFhh6enplpqaar169bImTZrY6dOnff2TkpKsQYMG9umnn9qVK1ds//791qdPHwsPD7cTJ0742pV3vDFjxtgNN9xQ5LUsWLDAJNnZs2d9y+Lj461du3YV2iYl+fa3v23du3cv8bm33nrLwsPDbc6cOdccp127dhYZGVnq8xkZGSbJYmNjfcvq47ZPSEiwhISESvUFUL0Ib0AVqmh4y8rKstDQULvzzjuLLF+1alWp4S07O7tI/7CwMBs1alSR/h9//LFJKhJmkpKSioWW7du3myT7xS9+UeHxalN4q4hrhTczM4/HY1FRUb5/18dtT3gDai8OmwIOOnLkiLKysjR48OBK9d+/f78uXbqk3r17F1nep08fBQUFFTss9029e/dWaGio77Dc9Y5XF1y+fFlmpoiIiDLbse0BOIXwBjjo5MmTkqTo6OhK9U9PT5ckhYWFFXsuKipKmZmZ1xwjODhYZ8+erbLx3O7QoUOSpE6dOpXZjm0PwCmEN8BBXq9XkpSTk1Op/lFRUZJU4hd7enq6YmJiyuyfl5dXpN31jlcXbNq0SZI0ZMiQMtux7QE4hfAGOKhLly7y8/PT5s2bK90/LCys2E1lt23bptzcXN1yyy1l9n///fdlZurbt2+FxwsICFBeXl6l6q6tTp8+rYULFyomJkYPPfRQmW3Z9gCcQngDHBQdHa34+HitXbtWS5cuVUZGhlJTU7VkyZJy9fd6vZo+fbreeOMNrVy5UhkZGdq7d68mT56s5s2bKykpqUj7wsJCXbhwQfn5+UpNTdUjjzyili1b+m6PUZHx2rdvr/Pnz2vdunXKy8vT2bNndfz48WI1NmrUSKdOndKxY8eUmZlZLaFj48aNFbpViJnp0qVLKiwslJnp7NmzSk5OVv/+/eXv769169Zd85w3tj0Axzh6uQRQx1TmViGZmZk2ceJEa9y4sYWFhVlcXJzNmjXLJFlMTIzt2bPH5s+fbyEhIb5bWKxYscLXv7Cw0BYsWGAdOnSwwMBAa9iwoY0YMcIOHjxYZD1JSUkWGBhoLVq0sICAAIuIiLDhw4fb0aNHi7Qr73jnzp2zQYMGmdfrtTZt2thPf/pTmzFjhkmy9u3b+26BsXPnTmvVqpWFhIRYXFxckVteXMuWLVusf//+1rx5c5NkkqxZs2bWr18/27x5s6/dhg0bLDw83ObOnVvqWOvXr7du3bpZaGioBQUFmZ+fn0nyXVl666232pw5c+zcuXNF+tXXbc/VpkDt5TEzcyw5AnVMSkqKRo4cqdr4tpo0aZLWrFmjc+fOOV1KvePGbZ+YmChJWrNmjcOVAPgmDpsC9UhBQYHTJdRbbHsAVYXwBqDGHDhwQB6P55qPUaNGOV0qANRahDegHnjyySe1bNkyXbx4UW3atNHatWsdqaNTp06yf/2yS5mP1atXO1Jfdagt2x5A3cE5b0AVqs3nvAEVwTlvQO3FnjcAAAAXIbwBAAC4COENAADARQhvAAAALkJ4AwAAcBHCGwAAgIsQ3gAAAFyE8Ib/3979hdRd/3Ecf339c46e6TmucE3RSZok6GztQsqtEGIXY3d5NjXMXAwWXUbhaDFqtCLW8KatsLqJwo5bMGvkbgoG/YMCt1rLZP8sMXGJZe6ITn3/LqLz+/nb3LSpXz/6fMC58Hu+5/t5+wXHk+855zsAAOAQ4g0AAMAhxBsAAIBDiDcAAACHEG8AAAAOId4AAAAckuL3AMBy1NbW5vcIwG3p7e1VXl6e32MAuAHiDVgANTU1fo8A3LZoNOr3CABuwDMz83
sIAJgLz/MUi8W0Y8cOv0cBgEXHZ94AAAAcQrwBAAA4hHgDAABwCPEGAADgEOINAADAIcQbAACAQ4g3AAAAhxBvAAAADiHeAAAAHEK8AQAAOIR4AwAAcAjxBgAA4BDiDQAAwCHEGwAAgEOINwAAAIcQbwAAAA4h3gAAABxCvAEAADiEeAMAAHAI8QYAAOAQ4g0AAMAhxBsAAIBDiDcAAACHEG8AAAAOId4AAAAcQrwBAAA4hHgDAABwCPEGAADgEOINAADAIcQbAACAQ4g3AAAAhxBvAAAADiHeAAAAHJLi9wAAcDMtLS0aGhq6bnt7e7suXbo0bVtjY6PuuuuuxRoNAHzhmZn5PQQAzGT37t1qaWlRMBhMbDMzeZ6X+HliYkKRSET9/f1KTU31Y0wAWDS8bQpgSaurq5MkjY2NJR7j4+PTfk5KSlJdXR3hBmBF4MobgCVtampKOTk5GhgYuOl+X3zxhTZt2rRIUwGAf7jyBmBJS0pKUn19vQKBwIz75OTkqLKychGnAgD/EG8Alry6ujqNj4/f8LnU1FQ1NDRM+wwcACxnvG0KwAmFhYXXfbv0H6dPn9Z99923yBMBgD+48gbACQ0NDTf8QkJhYSHhBmBFId4AOKG+vl7Xrl2bti01NVU7d+70aSIA8AdvmwJwRnl5uc6ePav//Weru7tbxcXFPk4FAIuLK28AnNHQ0KDk5GRJkud5uv/++wk3ACsO8QbAGY899pgmJyclScnJyXriiSd8nggAFh/xBsAZubm5qqyslOd5mpqa0vbt2/0eCQAWHfEGwCmPP/64zEwPP/ywcnNz/R4HABYdX1gAlqC2tjbV1NT4PQYcFY1GdfToUb/HALBAUvweAMDMYrGY3yMsSYcOHdLu3buVkZHh9yhLTnNzs98jAFhgxBuwhO3YscPvEZakyspK5eXl+T3GksQVN2D54zNvAJxDuAFYyYg3AAAAhxBvAAAADiHeAAAAHEK8AQAAOIR4AwAAcAjxBgAA4BDiDQAAwCHEGwAAgEOINwAAAIcQbwAAAA4h3gAAABxCvAEAADiEeAMAAHAI8QYsU7t27VJmZqY8z9Pp06f9Hudfu3btml555RXdc889CgQCysrKUllZmS5fvjyn43z00UcqLCyU53nTHoFAQGvWrFFVVZUOHjyooaGhhflFAGCeEG/AMvXOO+/o7bff9nuM21ZTU6P33ntPH3zwgeLxuH766ScVFRVpZGRkTseprq7WxYsXVVRUpEgkIjPT1NSUBgYG1NbWprvvvltNTU0qLS3Vd999t0C/DQDcvhS/BwCAmXz44Yc6fvy4zpw5o/Xr10uScnJy1N7ePi/H9zxPWVlZqqqqUlVVlbZt26aamhpt27ZN3d3dikQi87IOAMwnrrwBy5jneX6PcFvefPNNbdy4MRFuCy0ajaqxsVEDAwN66623FmVNAJgr4g1YJsxMBw8e1L333qtgMKhIJKLnnnvuuv0mJye1b98+rVu3Tunp6SovL1csFpMkHTlyRKtWrVIoFFJ7e7u2bt2qcDisvLw8tba2TjvOqVOnVFFRoVAopHA4rPXr12t4ePiWa8zW+Pi4vvnmG23YsOGW+548eVLhcFgHDhyY0xo30tjYKEnq6OhIbHPlnAFYIQzAkhOLxWyuf5579+41z/Ps0KFDNjQ0ZPF43A4fPmySrLOzM7Hfs88+a8Fg0I4dO2ZDQ0P2/PPPW1JSkn377beJ40iyzz77zP78808bGBiwhx56yFatWmXj4+NmZjYyMmLhcNhee+01Gx0dtf7+fnv00UftypUrs1pjNi5dumSSbMOGDVZVVWVr1661YDBoJSUl9sYbb9jU1FRi3xMnTlhmZqbt37//lsctKiqySCQy4/PDw8MmyfLz8507Z2Zm0WjUotHonF4DwC3EG7AEzTXe4vG4hUIh27Jly7Ttra2t0+JtdHTUQqGQ1dbWTnttMBi0p59+2sz+GyKjo6OJff6JwPPnz5uZ2dmzZ02SnT
hx4rpZZrPGbPzwww8mybZs2WJffvmlDQ4O2h9//GF79uwxSfb+++/P+lj/61bxZmbmeZ5lZWXN+vdZKufMjHgDVgLeNgWWgfPnzysej+uRRx656X4///yz4vG4ysrKEtvS09O1du1adXV1zfi6QCAg6e/bdkhSYWGh1qxZo/r6er344ovTbtvxb9f4f8FgUJJUWlqqyspK3XHHHYpEInrppZcUiUTU0tIy62PNxdWrV2VmCofDktw6ZwBWBuINWAZ6e3slSdnZ2Tfd7+rVq5KkF154Ydq9znp6ehSPx2e9Xnp6uj7//HNt3rxZBw4cUGFhoWprazU6Ojpva+Tk5EiSfv/992nbA4GACgoKdOHChVkfay66u7slSSUlJZLcOmcAVgbiDVgG0tLSJEljY2M33e+fuGtubpb9/bGJxOPrr7+e05qlpaX65JNP1NfXp6amJsViMb3++uvztkZGRoaKi4t17ty5656bmJhYsNt4nDx5UpK0detWSW6dMwArA/EGLANlZWVKSkrSqVOnbrpffn6+0tLSbvt/XOjr60tEVXZ2tl599VVt3LhR586dm7c1pL9v0NvZ2amLFy8mtsXjcfX09CzI7UP6+/vV3NysvLw8Pfnkk5LcO2cAlj/iDVgGsrOzVV1drWPHjundd9/V8PCwvv/+++s+F5aWlqadO3eqtbVVR44c0fDwsCYnJ9Xb26vffvtt1uv19fXpqaeeUldXl8bHx9XZ2amenh498MAD87aGJD3zzDMqKChQY2OjfvnlFw0ODqqpqUmjo6Pas2dPYr+Ojo453SrEzDQyMqKpqSmZma5cuaJYLKZNmzYpOTlZx48fT3zmzbVzBmAFWOQvSACYhX9zq5C//vrLdu3aZXfeeadlZGTY5s2bbd++fSbJ8vLy7MyZM2ZmNjY2Zk1NTbZu3TpLSUmx7Oxsq66uth9//NEOHz5soVDIJFlxcbFduHDBWlpaLBwOmyQrKCiw7u5uu3z5slVWVtrq1astOTnZcnNzbe/evTYxMXHLNebq119/tbq6Olu9erUFg0GrqKiwjo6Oaft8+umnlpmZaS+//PKMx/n444+tvLzcQqGQBQIBS0pKMkmJb5ZWVFTY/v37bXBw8LrXunTO+LYpsPx5ZmY+tiOAG2hra1NNTY3488Rcbd++XZJ09OhRnycBsFB42xQAAMAhxBuARdPV1TXtVhgzPWpra/0eFQCWrBS/BwCwcpSUlPBWMADcJq68AQAAOIR4AwAAcAjxBgAA4BDiDQAAwCHEGwAAgEOINwAAAIcQbwAAAA4h3gAAABxCvAEAADiEeAMAAHAI8QYAAOAQ4g0AAMAhxBsAAIBDiDcAAACHpPg9AICZeZ7n9whwUDQa9XsEAAvIMzPzewgA0/X29uqrr77yeww4Kj8/Xw8++KDfYwBYIMQbAACAQ/jMGwAAgEOINwAAAIcQbwAAAA5JkXTU7yEAAAAwO/8BGfpun4b6HcsAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<IPython.core.display.Image object>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 99
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "oPPfr_cTWmEH"
      },
      "source": [
        "Now that's a good looking model. Let's compile it just as we have the rest of our models.\n",
        "\n",
        "> 🔑 **Note:** Section 4.2 of [*Neural Networks for Joint Sentence Classification\n",
        "in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf) mentions using the SGD (stochastic gradient descent) optimizer, however, to stay consistent with our other models, we're going to use the Adam optimizer. As an exercise, you could try using [`tf.keras.optimizers.SGD`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/SGD) instead of [`tf.keras.optimizers.Adam`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam) and compare the results."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4Yx8PFSc2hqE"
      },
      "source": [
        "# Compile token char model\n",
        "model_4.compile(loss=\"categorical_crossentropy\",\n",
        "                optimizer=tf.keras.optimizers.Adam(), # section 4.2 of https://arxiv.org/pdf/1612.05251.pdf mentions using SGD but we'll stick with Adam\n",
        "                metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "f-bD7bL-UIn3"
      },
      "source": [
        "And again, to keep our experiments fast, we'll fit our token-character-hybrid model on 10% of training and validate on 10% of validation batches. However, the difference with this model is that it requires two inputs, token-level sequences and character-level sequences.\n",
        "\n",
        "We can do this by create a `tf.data.Dataset` with a tuple as it's first input, for example:\n",
        "* `((token_data, char_data), (label))`\n",
        "\n",
        "Let's see it in action.\n",
        "\n",
        "### Combining token and character data into a `tf.data` dataset"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "pYU0fX6rpbgI"
      },
      "source": [
        "# Combine chars and tokens into a dataset\n",
        "train_char_token_data = tf.data.Dataset.from_tensor_slices((train_sentences, train_chars)) # make data\n",
        "train_char_token_labels = tf.data.Dataset.from_tensor_slices(train_labels_one_hot) # make labels\n",
        "train_char_token_dataset = tf.data.Dataset.zip((train_char_token_data, train_char_token_labels)) # combine data and labels\n",
        "\n",
        "# Prefetch and batch train data\n",
        "train_char_token_dataset = train_char_token_dataset.batch(32).prefetch(tf.data.AUTOTUNE) \n",
        "\n",
        "# Repeat same steps validation data\n",
        "val_char_token_data = tf.data.Dataset.from_tensor_slices((val_sentences, val_chars))\n",
        "val_char_token_labels = tf.data.Dataset.from_tensor_slices(val_labels_one_hot)\n",
        "val_char_token_dataset = tf.data.Dataset.zip((val_char_token_data, val_char_token_labels))\n",
        "val_char_token_dataset = val_char_token_dataset.batch(32).prefetch(tf.data.AUTOTUNE)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "UlOs99Emp52r",
        "outputId": "944b3f30-183f-4b86-a559-0b1ce2031558"
      },
      "source": [
        "# Check out training char and token embedding dataset\n",
        "train_char_token_dataset, val_char_token_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(<PrefetchDataset shapes: (((None,), (None,)), (None, 5)), types: ((tf.string, tf.string), tf.float64)>,\n",
              " <PrefetchDataset shapes: (((None,), (None,)), (None, 5)), types: ((tf.string, tf.string), tf.float64)>)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 102
        }
      ]
    },
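    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To build intuition for the nested `((token_data, char_data), label)` structure shown in the output above, here's a minimal framework-free sketch in plain Python. The helper names `zip_features_and_labels` and `batch` and the toy sentences are hypothetical (they aren't part of TensorFlow or this notebook); they only mimic the idea of zipping inputs with labels and then batching.\n",
        "\n",
        "```python\n",
        "from itertools import islice\n",
        "\n",
        "def zip_features_and_labels(tokens, chars, labels):\n",
        "    '''Pair each (token, char) input tuple with its label.'''\n",
        "    return [((t, c), y) for t, c, y in zip(tokens, chars, labels)]\n",
        "\n",
        "def batch(samples, batch_size):\n",
        "    '''Yield lists of at most batch_size samples.'''\n",
        "    iterator = iter(samples)\n",
        "    while True:\n",
        "        chunk = list(islice(iterator, batch_size))\n",
        "        if not chunk:\n",
        "            return\n",
        "        yield chunk\n",
        "\n",
        "# Toy data (hypothetical, not taken from the PubMed dataset)\n",
        "tokens = ['to investigate drug x', 'patients were randomised', 'pain decreased']\n",
        "chars = [' '.join(sentence) for sentence in tokens]  # space-separated characters\n",
        "labels = [0, 1, 2]\n",
        "\n",
        "dataset = zip_features_and_labels(tokens, chars, labels)\n",
        "batches = list(batch(dataset, batch_size=2))\n",
        "print(len(batches))   # number of batches\n",
        "print(batches[0][0])  # first sample: ((token_text, char_text), label)\n",
        "```\n",
        "\n",
        "Unlike `tf.data`, which stacks each component of a batch into its own tensor, this sketch keeps a batch as a simple list of samples. The point is only the nesting: two inputs grouped together, followed by one label."
      ]
    },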
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ANLBMpRlfA73"
      },
      "source": [
        "### Fitting a model on token and character-level sequences"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "yp0c25coprwp",
        "outputId": "21353731-89af-4248-eb12-86057213d800"
      },
      "source": [
        "# Fit the model on tokens and chars\n",
        "model_4_history = model_4.fit(train_char_token_dataset, # train on dataset of token and characters\n",
        "                              steps_per_epoch=int(0.1 * len(train_char_token_dataset)),\n",
        "                              epochs=3,\n",
        "                              validation_data=val_char_token_dataset,\n",
        "                              validation_steps=int(0.1 * len(val_char_token_dataset)))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/3\n",
            "562/562 [==============================] - 34s 48ms/step - loss: 1.1350 - accuracy: 0.5489 - val_loss: 0.7629 - val_accuracy: 0.7071\n",
            "Epoch 2/3\n",
            "562/562 [==============================] - 24s 44ms/step - loss: 0.8025 - accuracy: 0.6950 - val_loss: 0.7007 - val_accuracy: 0.7304\n",
            "Epoch 3/3\n",
            "562/562 [==============================] - 22s 40ms/step - loss: 0.7630 - accuracy: 0.7054 - val_loss: 0.6777 - val_accuracy: 0.7400\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "BfAMuoJett_t",
        "outputId": "8a8050c8-1b9c-40a8-dd85-a2864727475a"
      },
      "source": [
        "# Evaluate on the whole validation dataset\n",
        "model_4.evaluate(val_char_token_dataset)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 19s 21ms/step - loss: 0.6827 - accuracy: 0.7403\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[0.6827240586280823, 0.7402687668800354]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 104
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uSimi5vYY2xF"
      },
      "source": [
        "Nice! Our token-character hybrid model has come to life!\n",
        "\n",
        "To make predictions with it, since it takes multiplie inputs, we can pass the `predict()` method a tuple of token-level sequences and character-level sequences.\n",
        "\n",
        "We can then evaluate the predictions as we've done before."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1z_zbrXTYN7G",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "2c0d7a4d-aae0-484b-98d8-ff9f985240ca"
      },
      "source": [
        "# Make predictions using the token-character model hybrid\n",
        "model_4_pred_probs = model_4.predict(val_char_token_dataset)\n",
        "model_4_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[4.6708044e-01, 2.6864669e-01, 3.2527896e-03, 2.5465280e-01,\n",
              "        6.3672690e-03],\n",
              "       [3.1502637e-01, 5.0380874e-01, 2.1617012e-03, 1.7747717e-01,\n",
              "        1.5259499e-03],\n",
              "       [3.3295774e-01, 1.4105824e-01, 4.9620651e-02, 4.4369906e-01,\n",
              "        3.2664366e-02],\n",
              "       ...,\n",
              "       [4.7793266e-04, 8.2938904e-03, 7.7614605e-02, 2.1132462e-04,\n",
              "        9.1340226e-01],\n",
              "       [6.4677000e-03, 4.8982769e-02, 2.1276093e-01, 3.5061575e-03,\n",
              "        7.2828245e-01],\n",
              "       [2.9351631e-01, 5.2124566e-01, 1.1931888e-01, 2.7323093e-02,\n",
              "        3.8596042e-02]], dtype=float32)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 105
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ic5MCrFxYgsB",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "96e4e1d1-5221-4dde-9c66-868018499324"
      },
      "source": [
        "# Turn prediction probabilities into prediction classes\n",
        "model_4_preds = tf.argmax(model_4_pred_probs, axis=1)\n",
        "model_4_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(30212,), dtype=int64, numpy=array([0, 1, 3, ..., 4, 4, 1])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 106
        }
      ]
    },
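    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `tf.argmax(..., axis=1)` call above picks, for each row of prediction probabilities, the index of the largest value. A rough plain-Python equivalent is sketched below, assuming a hypothetical helper `argmax_rows` and using probabilities rounded from the first three rows printed earlier.\n",
        "\n",
        "```python\n",
        "def argmax_rows(pred_probs):\n",
        "    '''Return the index of the largest probability in each row.'''\n",
        "    return [max(range(len(row)), key=row.__getitem__) for row in pred_probs]\n",
        "\n",
        "# Rounded copies of the first three prediction rows shown above\n",
        "pred_probs = [\n",
        "    [0.467, 0.269, 0.003, 0.255, 0.006],\n",
        "    [0.315, 0.504, 0.002, 0.177, 0.002],\n",
        "    [0.333, 0.141, 0.050, 0.444, 0.033],\n",
        "]\n",
        "\n",
        "pred_classes = argmax_rows(pred_probs)\n",
        "print(pred_classes)  # [0, 1, 3]\n",
        "```\n",
        "\n",
        "These match the first three values of `model_4_preds`. If two classes tie, this sketch returns the first of them."
      ]
    },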
    {
      "cell_type": "code",
      "metadata": {
        "id": "CBNPIRIC7EsE",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "ddaefe11-bca6-4dcc-af8d-8c69bca867f2"
      },
      "source": [
        "# Get results of token-char-hybrid model\n",
        "model_4_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                    y_pred=model_4_preds)\n",
        "model_4_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 74.02687673772012,\n",
              " 'f1': 0.7380992152112655,\n",
              " 'precision': 0.7403019084403328,\n",
              " 'recall': 0.7402687673772012}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 107
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wU5ctbxCih6Z"
      },
      "source": [
        "## Model 5: Transfer Learning with pretrained token embeddings + character embeddings + positional embeddings \n",
        "\n",
        "It seems like combining token embeddings and character embeddings gave our model a little performance boost.\n",
        "\n",
        "But there's one more piece of the puzzle we can add in.\n",
        "\n",
        "What if we engineered our own features into the model?\n",
        "\n",
        "Meaning, what if we took our own knowledge about the data and encoded it in a numerical way to give our model more information about our samples?\n",
        "\n",
        "The process of applying your own knowledge to build features as input to a model is called **feature engineering**.\n",
        "\n",
        "Can you think of something important about the sequences we're trying to classify?\n",
        "\n",
        "If you were to look at an abstract, would you expect the sentences to appear in order? Or does it make sense if they were to appear sequentially? For example, sequences labelled `CONCLUSIONS` at the beggining and sequences labelled `OBJECTIVE` at the end?\n",
        "\n",
        "Abstracts typically come in a sequential order, such as:\n",
        "* `OBJECTIVE` ...\n",
        "* `METHODS` ...\n",
        "* `METHODS` ...\n",
        "* `METHODS` ...\n",
        "* `RESULTS` ...\n",
        "* `CONCLUSIONS` ...\n",
        "\n",
        "Or\n",
        "\n",
        "* `BACKGROUND` ...\n",
        "* `OBJECTIVE` ...\n",
        "* `METHODS` ...\n",
        "* `METHODS` ...\n",
        "* `RESULTS` ...\n",
        "* `RESULTS` ...\n",
        "* `CONCLUSIONS` ...\n",
        "* `CONCLUSIONS` ...\n",
        "\n",
        "Of course, we can't engineer the sequence labels themselves into the training data (we don't have these at test time), but we can encode the order of a set of sequences in an abstract.\n",
        "\n",
        "For example,\n",
        "* `Sentence 1 of 10` ...\n",
        "* `Sentence 2 of 10` ...\n",
        "* `Sentence 3 of 10` ...\n",
        "* `Sentence 4 of 10` ...\n",
        "* ...\n",
        "\n",
        "\n",
        "You might've noticed this when we created our `preprocess_text_with_line_numbers()` function. When we read in a text file of abstracts, we counted the number of lines in an abstract as well as the number of each line itself.\n",
        "\n",
        "Doing this led to the `\"line_number\"` and `\"total_lines\"` columns of our DataFrames."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Htf-tnFcEcAn",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 195
        },
        "outputId": "d0329931-45c6-4fda-e437-d48b3fbda837"
      },
      "source": [
        "# Inspect training dataframe\n",
        "train_df.head()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>target</th>\n",
              "      <th>text</th>\n",
              "      <th>line_number</th>\n",
              "      <th>total_lines</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>to investigate the efficacy of @ weeks of dail...</td>\n",
              "      <td>0</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>a total of @ patients with primary knee oa wer...</td>\n",
              "      <td>1</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>outcome measures included pain reduction and i...</td>\n",
              "      <td>2</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>pain was assessed using the visual analog pain...</td>\n",
              "      <td>3</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>secondary outcome measures included the wester...</td>\n",
              "      <td>4</td>\n",
              "      <td>11</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "      target  ... total_lines\n",
              "0  OBJECTIVE  ...          11\n",
              "1    METHODS  ...          11\n",
              "2    METHODS  ...          11\n",
              "3    METHODS  ...          11\n",
              "4    METHODS  ...          11\n",
              "\n",
              "[5 rows x 4 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 108
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IZ5HvKoiGU6m"
      },
      "source": [
        "The `\"line_number\"` and `\"total_lines\"` columns are features which didn't necessarily come with the training data but can be passed to our model as a **positional embedding**. In other words, the positional embedding is where the sentence appears in an abstract.\n",
        "\n",
        "We can use these features because they will be available at test time. \n",
        "\n",
        "![example of engineering features into our dataset to help our model](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/images/09-engineered-features-at-test-time.png)\n",
        "*Since abstracts typically have a sequential order about them (for example, background, objective, methods, results, conclusion), it makes sense to add the line number of where a particular sentence occurs to our model. The beautiful thing is, these features will be available at test time (we can just count the number of sentences in an abstract and the number of each one).*\n",
        "\n",
        "Meaning, if we were to predict the labels of sequences in an abstract our model had never seen, we could count the number of lines and the track the position of each individual line and pass it to our model.\n",
        "\n",
 🛠 **Exercise:**">
        "> 🛠 **Exercise:** Another way of creating our positional embedding feature would be to combine the `\"line_number\"` and `\"total_lines\"` columns into one. For example, a `\"line_position\"` column may contain values like `1_of_11`, `2_of_11`, etc., where `1_of_11` would be the first line in an abstract 11 sentences long. After going through the following steps, you might want to revisit this positional embedding stage and see how a combined `\"line_position\"` column compares with two separate columns."
      ]
    },
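    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a small illustration of why these features are available at test time, the sketch below derives them from nothing but the sentences of an unseen abstract. The helper name `add_line_positions` is hypothetical, and it assumes a zero-based `line_number` with `total_lines` stored as the plain sentence count; the notebook's own `preprocess_text_with_line_numbers()` function may define `total_lines` differently.\n",
        "\n",
        "```python\n",
        "def add_line_positions(sentences):\n",
        "    '''Attach a zero-based line_number and the sentence count to each sentence.'''\n",
        "    total = len(sentences)  # assumption: total_lines is the number of sentences\n",
        "    return [\n",
        "        {'text': text, 'line_number': index, 'total_lines': total}\n",
        "        for index, text in enumerate(sentences)\n",
        "    ]\n",
        "\n",
        "# Toy abstract (hypothetical sentences)\n",
        "abstract = [\n",
        "    'to investigate the efficacy of a treatment',\n",
        "    'patients were randomised into two groups',\n",
        "    'the treatment group improved',\n",
        "]\n",
        "\n",
        "for row in add_line_positions(abstract):\n",
        "    print(row['line_number'], row['total_lines'], row['text'])\n",
        "```\n",
        "\n",
        "No labels are needed to compute either value, which is what makes them safe to use as model inputs."
      ]
    },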
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ABuz5baDJwY-"
      },
      "source": [
        "### Create positional embeddings\n",
        "\n",
        "Okay, enough talk about positional embeddings, let's create them.\n",
        "\n",
        "Since our `\"line_number\"` and `\"total_line\"` columns are already numerical, we could pass them as they are to our model.\n",
        "\n",
        "But to avoid our model thinking a line with `\"line_number\"=5` is five times greater than a line with `\"line_number\"=1`, we'll use one-hot-encoding to encode our `\"line_number\"` and `\"total_lines\"` features.\n",
        "\n",
        "To do this, we can use the [`tf.one_hot`](https://www.tensorflow.org/api_docs/python/tf/one_hot) utility.\n",
        "\n",
        "`tf.one_hot` returns a one-hot-encoded tensor. It accepts an array (or tensor) as input and the `depth` parameter determines the dimension of the returned tensor.\n",
        "\n",
        "To figure out what we should set the `depth` parameter to, let's investigate the distribution of the `\"line_number\"` column.\n",
        "\n",
        "> 🔑 **Note:** When it comes to one-hot-encoding our features, Scikit-Learn's [`OneHotEncoder`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html) class is another viable option here."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "LJVhuU7cMd0-",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "aaf65f6e-dc49-49f5-e82e-7c54d5ed0d06"
      },
      "source": [
        "# How many different line numbers are there?\n",
        "train_df[\"line_number\"].value_counts()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "0     15000\n",
              "1     15000\n",
              "2     15000\n",
              "3     15000\n",
              "4     14992\n",
              "5     14949\n",
              "6     14758\n",
              "7     14279\n",
              "8     13346\n",
              "9     11981\n",
              "10    10041\n",
              "11     7892\n",
              "12     5853\n",
              "13     4152\n",
              "14     2835\n",
              "15     1861\n",
              "16     1188\n",
              "17      751\n",
              "18      462\n",
              "19      286\n",
              "20      162\n",
              "21      101\n",
              "22       66\n",
              "23       33\n",
              "24       22\n",
              "25       14\n",
              "26        7\n",
              "27        4\n",
              "28        3\n",
              "29        1\n",
              "30        1\n",
              "Name: line_number, dtype: int64"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 109
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "rKoNMSBNImLG",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 282
        },
        "outputId": "f35a58e5-936a-4a7e-e7fe-04b4146e6b41"
      },
      "source": [
        "# Check the distribution of \"line_number\" column\n",
        "train_df.line_number.plot.hist()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<matplotlib.axes._subplots.AxesSubplot at 0x7fe73ad38550>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 110
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZEAAAD4CAYAAAAtrdtxAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAASwElEQVR4nO3df9CdZX3n8ffHAAVtFShZliHQYM3UTV2rGIGO7a6LIwZphXbVwtQ16zCmM+KMTveH0eks1pYZ3NkWS0fd0pJpcNtGqlayBYeNiv3xBz+CoAiU8hTDkoiQGhCpFjb43T/O9cAxPnlyciXnOc/J837NnHnu+3tf97mva+7kfOb+ce6TqkKSpB7Pm3QHJEnTyxCRJHUzRCRJ3QwRSVI3Q0SS1O2ISXdgoZ1wwgm1cuXKSXdDkqbG7bff/o9VtXyuZUsuRFauXMm2bdsm3Q1JmhpJHtzXMk9nSZK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkrotuW+sH4yVG66fdBcW3PbLz5t0FyQtYh6JSJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbz87SvCb1vDCf2SVNB49EJEndDBFJUjdDRJLUzRCRJHUzRCRJ3QwRSVI3Q0SS1G3sIZJkWZI7kvxlmz8tyS1JZpJ8MslRrf4jbX6mLV859B7vb/X7krxhqL621WaSbBj3WCRJP2ghjkTeA9w7NP9h4IqqegnwGHBxq18MPNbqV7R2JFkNXAj8NLAW+FgLpmXAR4FzgdXARa2tJGmBjDVEkqwAzgP+qM0HOBv4VGuyCbigTZ/f5mnLX9fanw9srqqnqurrwAxwRnvNVNUDVfU0sLm1lSQtkHEfiXwE+K/A99v8jwOPV9WeNr8DOLlNnww8BNCWf7u1f7a+1zr7qv+QJOuTbEuybdeuXQc7JklSM7YQSfILwKNVdfu4tjGqqrqqqtZU1Zrly5dPujuSdNgY5wMYXwO8KckbgaOBFwK/Bxyb5Ih2tLEC2Nna7wROAXYkOQJ4EfCtofqs4XX2VZckLYCxHYlU1furakVVrWRwYfyLVfWrwE3Am1uzdcB1bXpLm6ct/2JVVatf2O7eOg1YBdwK3Aasand7HdW2sWVc45Ek/bBJPAr+fcDmJL8N3AFc3epXA59IMgPsZhAKVNXdSa4F7gH2AJdU1TMASd4N3AgsAzZW1d0LOhJJWuIWJESq6kvAl9r0AwzurNq7zT8Db9nH+pcBl81RvwG44RB2VZJ0APzGuiSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSp29hCJMnRSW5N8pUkdyf5zVY/LcktSWaSfDLJUa3+I21+pi1fOfRe72/1+5K8Yai+ttVmkmwY11gkSXMb55HIU8DZVfUzwCuAtUnOAj4MXFFVLwEeAy5u7S8GHmv1K1o7kqwGLgR+GlgLfCzJsiTLgI8C5wKrgYtaW0nSAhlbiNTAk232yPYq4GzgU62+CbigTZ/f5mnLX5ckrb65qp6qqq8DM8AZ7TVTVQ9U1dPA5tZWkrRAjhjnm7ejhduBlzA4avgH4PGq2tOa7ABObtMnAw8BVNWeJN8GfrzVbx562+F1HtqrfuY++rEeWA9w6qmnHtygtCBWbrh+Ytvefvl5E9u2NG3GemG9qp6pqlcAKxgcObx0nNubpx9XVdWaqlqzfPnySXRBkg5LC3J3VlU9DtwE/CxwbJ
LZI6AVwM42vRM4BaAtfxHwreH6Xuvsqy5JWiDjvDtreZJj2/QxwOuBexmEyZtbs3XAdW16S5unLf9iVVWrX9ju3joNWAXcCtwGrGp3ex3F4OL7lnGNR5L0w8Z5TeQkYFO7LvI84Nqq+ssk9wCbk/w2cAdwdWt/NfCJJDPAbgahQFXdneRa4B5gD3BJVT0DkOTdwI3AMmBjVd09xvFIkvYythCpqq8Cr5yj/gCD6yN71/8ZeMs+3usy4LI56jcANxx0ZyVJXUY6nZXkX4+7I5Kk6TPqNZGPtW+fvyvJi8baI0nS1BgpRKrq54FfZXA31O1J/jTJ68faM0nSojfy3VlVdT/wG8D7gH8LXJnk75L88rg6J0la3Ea9JvLyJFcwuEX3bOAXq+pftekrxtg/SdIiNurdWb8P/BHwgar63myxqr6R5DfG0jNJ0qI3aoicB3xv6PsZzwOOrqrvVtUnxtY7SdKiNuo1kc8DxwzNP7/VJElL2KghcvTQY91p088fT5ckSdNi1BD5pySnz84keRXwvXnaS5KWgFGvibwX+PMk3wAC/EvgV8bWK0nSVBgpRKrqtiQvBX6qle6rqv83vm5JkqbBgTyA8dXAyrbO6UmoqmvG0itJ0lQYKUSSfAL4SeBO4JlWLsAQkaQlbNQjkTXA6vYjUZIkAaPfnfU1BhfTJUl61qhHIicA9yS5FXhqtlhVbxpLryRJU2HUEPngODshSZpOo97i+1dJfgJYVVWfT/J8Br9rLklawkZ9FPw7gU8Bf9BKJwOfHVenJEnTYdQL65cArwGegGd/oOpfjKtTkqTpMGqIPFVVT8/OJDmCwfdEJElL2Kgh8ldJPgAc035b/c+B/z2+bkmSpsGoIbIB2AXcBfwacAOD31uXJC1ho96d9X3gD9tLkiRg9GdnfZ05roFU1YsPeY8kSVPjQJ6dNeto4C3A8Ye+O5KkaTLSNZGq+tbQa2dVfQQ4b8x9kyQtcqOezjp9aPZ5DI5MDuS3SCRJh6FRg+B3hqb3ANuBtx7y3kiSpsqod2f9u3F3RJI0fUY9nfXr8y2vqt89NN2RJE2TA7k769XAljb/i8CtwP3j6JQkaTqMGiIrgNOr6jsAST4IXF9VbxtXxyRJi9+ojz05EXh6aP7pVpMkLWGjHolcA9ya5C/a/AXApvF0SZI0LUa9O+uyJJ8Dfr6V3lFVd4yvW5KkaTDq6SyA5wNPVNXvATuSnDZf4ySnJLkpyT1J7k7ynlY/PsnWJPe3v8e1epJcmWQmyVeHv+CYZF1rf3+SdUP1VyW5q61zZZIc0OglSQdl1J/HvRR4H/D+VjoS+F/7WW0P8J+qajVwFnBJktUMHiv/hapaBXyhzQOcC6xqr/XAx9u2jwcuBc4EzgAunQ2e1uadQ+utHWU8kqRDY9QjkV8C3gT8E0BVfQP4sflWqKqHq+rLbfo7wL0Mfpv9fJ67nrKJwfUVWv2aGrgZODbJScAbgK1VtbuqHgO2AmvbshdW1c1VVQyu28y+lyRpAYwaIk+3D+oCSPKCA9lIkpXAK4FbgBOr6uG26Js8d5fXycBDQ6vtaLX56jvmqM+1/fVJtiXZtmvXrgPpuiRpHqOGyLVJ/oDB0cE7gc8z4g9UJflR4NPAe6vqieFlw8E0TlV1VVWtqao1y5cvH/fmJGnJ2O/dWe1i9SeBlwJPAD8F/Leq2jrCukcyCJA/qarPtPIjSU6qqofbKalHW30ncMrQ6itabSfw2r3qX2r1FXO0lyQtkP0eibSjhRuqamtV/Zeq+s8jBkiAq4F793q21hZg9g6rdcB1Q/W3t7u0zgK+3U573Qick+S4dkH9HODGtuyJJGe1bb196L0kSQtg1C8bfjnJq6vqtgN479cA/wG4K8mdrfYB4HIGp8cuBh7kuUfK3wC8EZgBvgu8A6Cqdif5LWB22x+qqt1t+l3AHwPHAJ9rL0nSAhk1RM4E3pZkO4M7tMLgIOXl+1qhqv62tZvL6+ZoX8Al+3ivjc
DGOerbgJftr/OSpPGYN0SSnFpV/5fBbbaSJP2A/R2JfJbB03sfTPLpqvr3C9EpSdJ02N+F9eHTUS8eZ0ckSdNnfyFS+5iWJGm/p7N+JskTDI5IjmnT8NyF9ReOtXeSpEVt3hCpqmUL1RFJ0vQ5kEfBS5L0AwwRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdTNEJEndDBFJUjdDRJLUzRCRJHUzRCRJ3QwRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdTNEJEndjph0B6TFZuWG6yey3e2XnzeR7UoHwyMRSVI3Q0SS1M0QkSR1M0QkSd0MEUlSN0NEktTNEJEkdTNEJEndDBFJUrexhUiSjUkeTfK1odrxSbYmub/9Pa7Vk+TKJDNJvprk9KF11rX29ydZN1R/VZK72jpXJsm4xiJJmts4j0T+GFi7V20D8IWqWgV8oc0DnAusaq/1wMdhEDrApcCZwBnApbPB09q8c2i9vbclSRqzsYVIVf01sHuv8vnApja9CbhgqH5NDdwMHJvkJOANwNaq2l1VjwFbgbVt2Qur6uaqKuCaofeSJC2Qhb4mcmJVPdymvwmc2KZPBh4aarej1ear75ijPqck65NsS7Jt165dBzcCSdKzJnZhvR1B1AJt66qqWlNVa5YvX74Qm5SkJWGhQ+SRdiqK9vfRVt8JnDLUbkWrzVdfMUddkrSAFjpEtgCzd1itA64bqr+93aV1FvDtdtrrRuCcJMe1C+rnADe2ZU8kOavdlfX2ofeSJC2Qsf0oVZI/A14LnJBkB4O7rC4Hrk1yMfAg8NbW/AbgjcAM8F3gHQBVtTvJbwG3tXYfqqrZi/XvYnAH2DHA59pLkrSAxhYiVXXRPha9bo62BVyyj/fZCGyco74NeNnB9FGSdHD8xrokqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSep2xKQ7IGlg5YbrJ7Ld7ZefN5Ht6vDgkYgkqZshIknqZohIkroZIpKkboaIJKmbISJJ6maISJK6GSKSpG6GiCSpmyEiSepmiEiSuhkikqRuhogkqZtP8ZWWuEk9PRh8gvDhYOqPRJKsTXJfkpkkGybdH0laSqY6RJIsAz4KnAusBi5KsnqyvZKkpWPaT2edAcxU1QMASTYD5wP3TLRXkkbiD3FNv2kPkZOBh4bmdwBn7t0oyXpgfZt9Msl9nds7AfjHznUXm8NlLIfLOMCxLJh8eOSmi3ocB+hgxvIT+1ow7SEykqq6CrjqYN8nybaqWnMIujRxh8tYDpdxgGNZjA6XccD4xjLV10SAncApQ/MrWk2StACmPURuA1YlOS3JUcCFwJYJ90mSloypPp1VVXuSvBu4EVgGbKyqu8e4yYM+JbaIHC5jOVzGAY5lMTpcxgFjGkuqahzvK0laAqb9dJYkaYIMEUlSN0NkBIfTo1WSbE9yV5I7k2ybdH8ORJKNSR5N8rWh2vFJtia5v/09bpJ9HNU+xvLBJDvbvrkzyRsn2cdRJDklyU1J7klyd5L3tPrU7Zd5xjKN++XoJLcm+Uoby2+2+mlJbmmfZZ9sNyQd3La8JjK/9miVvwdez+DLjLcBF1XVVH4rPsl2YE1VTd0XqJL8G+BJ4Jqqelmr/Xdgd1Vd3gL+uKp63yT7OYp9jOWDwJNV9T8m2bcDkeQk4KSq+nKSHwNuBy4A/iNTtl/mGctbmb79EuAFVfVkkiOBvwXeA/w68Jmq2pzkfwJfqaqPH8y2PBLZv2cfrVJVTwOzj1bRAquqvwZ271U+H9jUpjcx+E+/6O1jLFOnqh6uqi+36e8A9zJ4ksTU7Zd5xjJ1auDJNntkexVwNvCpVj8k+8UQ2b+5Hq0ylf+wmgL+T5Lb2+Ngpt2JVfVwm/4mcOIkO3
MIvDvJV9vprkV/CmhYkpXAK4FbmPL9stdYYAr3S5JlSe4EHgW2Av8APF5Ve1qTQ/JZZogsPT9XVaczePLxJe20ymGhBudmp/n87MeBnwReATwM/M5kuzO6JD8KfBp4b1U9Mbxs2vbLHGOZyv1SVc9U1SsYPMnjDOCl49iOIbJ/h9WjVapqZ/v7KPAXDP5xTbNH2rns2XPaj064P92q6pH2H//7wB8yJfumnXP/NPAnVfWZVp7K/TLXWKZ1v8yqqseBm4CfBY5NMvsl80PyWWaI7N9h82iVJC9oFwxJ8gLgHOBr86+16G0B1rXpdcB1E+zLQZn90G1+iSnYN+0C7tXAvVX1u0OLpm6/7GssU7pflic5tk0fw+DGoHsZhMmbW7NDsl+8O2sE7Za+j/Dco1Uum3CXuiR5MYOjDxg88uZPp2ksSf4MeC2DR1o/AlwKfBa4FjgVeBB4a1Ut+gvW+xjLaxmcMilgO/BrQ9cVFqUkPwf8DXAX8P1W/gCDawlTtV/mGctFTN9+eTmDC+fLGBwsXFtVH2qfAZuB44E7gLdV1VMHtS1DRJLUy9NZkqRuhogkqZshIknqZohIkroZIpKkboaIJKmbISJJ6vb/AVwSphAAsBgmAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pPnEkJvXKuf9"
      },
      "source": [
        "Looking at the distribution of the `\"line_number\"` column, it looks like the majority of lines have a position of 15 or less.\n",
        "\n",
        "Knowing this, let's set the `depth` parameter of `tf.one_hot` to 15."
      ]
    },
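    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To put a number on that choice, here's a minimal sketch that measures how many training samples keep their exact position for a given `depth`. The counts are copied from the `value_counts()` output above; the helper name `coverage` is hypothetical.\n",
        "\n",
        "```python\n",
        "# Counts per line_number (index 0 to 30), copied from the value_counts() output above\n",
        "line_number_counts = [\n",
        "    15000, 15000, 15000, 15000, 14992, 14949, 14758, 14279, 13346, 11981,\n",
        "    10041, 7892, 5853, 4152, 2835, 1861, 1188, 751, 462, 286,\n",
        "    162, 101, 66, 33, 22, 14, 7, 4, 3, 1, 1,\n",
        "]\n",
        "\n",
        "def coverage(counts, depth):\n",
        "    '''Fraction of samples whose line_number falls in the range [0, depth).'''\n",
        "    return sum(counts[:depth]) / sum(counts)\n",
        "\n",
        "for depth in (10, 15, 20, 31):\n",
        "    print(depth, round(100 * coverage(line_number_counts, depth), 2))\n",
        "```\n",
        "\n",
        "For these counts, `depth=15` keeps the exact position for roughly 97% of samples, and `depth=31` would cover every line number seen in the training set."
      ]
    },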
    {
      "cell_type": "code",
      "metadata": {
        "id": "LsjdKcXUMkgE"
      },
      "source": [
        "# Use TensorFlow to create one-hot-encoded tensors of our \"line_number\" column \n",
        "train_line_numbers_one_hot = tf.one_hot(train_df[\"line_number\"].to_numpy(), depth=15)\n",
        "val_line_numbers_one_hot = tf.one_hot(val_df[\"line_number\"].to_numpy(), depth=15)\n",
        "test_line_numbers_one_hot = tf.one_hot(test_df[\"line_number\"].to_numpy(), depth=15)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "7MTRSo_OLWGS"
      },
      "source": [
        "Setting the `depth` parameter of `tf.one_hot` to 15 means any sample with a `\"line_number\"` value of 15 or more gets set to a tensor of all 0's, whereas any sample with a `\"line_number\"` of under 15 gets turned into a tensor of all 0's but with a 1 at the index equal to the `\"line_number\"` value.\n",
        "\n",
        "> 🔑 **Note:** We could create a one-hot tensor which has room for all of the potential values of `\"line_number\"` (`depth=30`), however, this would result in a tensor double the size of our current one (`depth=15`) where the vast majority of values are 0. Plus, only ~2,000/180,000 samples have a `\"line_number\"` value of over 15. So we would not be gaining much information about our data for doubling our feature space. This kind of problem is called the **curse of dimensionality**. However, since we're working with deep models, it might be worth trying to throw as much information at the model as possible and seeing what happens. I'll leave exploring values of the `depth` parameter as an extension."
      ]
    },
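    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the out-of-range behaviour concrete, here is a minimal sketch. The index values are made up for illustration (they are not taken from the dataset): `tf.one_hot` gives any index from 0 to `depth - 1` a single 1, while any index at or beyond `depth` becomes a row of all 0's."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "import tensorflow as tf\n",
        "\n",
        "# Illustrative indices only: 0 and 14 fit inside depth=15, 15 and 20 do not\n",
        "demo_line_numbers = [0, 14, 15, 20]\n",
        "demo_one_hot = tf.one_hot(demo_line_numbers, depth=15)\n",
        "\n",
        "# Sum each row: 1.0 means the index was encoded, 0.0 means it fell outside the depth\n",
        "print(demo_one_hot.numpy().sum(axis=1))"
      ],
      "execution_count": null,
      "outputs": []
    },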
    {
      "cell_type": "code",
      "metadata": {
        "id": "R7BERNOQK723",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "c023e1d4-1f86-4df4-e08a-42c51c75441a"
      },
      "source": [
        "# Check one-hot encoded \"line_number\" feature samples\n",
        "train_line_numbers_one_hot.shape, train_line_numbers_one_hot[:20]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([180040, 15]), <tf.Tensor: shape=(20, 15), dtype=float32, numpy=\n",
              " array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],\n",
              "        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]],\n",
              "       dtype=float32)>)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 112
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CxYgKu6tMBbg"
      },
      "source": [
        "We can do the same as we've done for our `\"line_number\"` column with the `\"total_lines\"` column. First, let's find an appropriate value for the `depth` parameter of `tf.one_hot`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "S3bLbdWzOBmY",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "2c928023-e0a7-4284-d254-e174b435a811"
      },
      "source": [
        "# How many different numbers of lines are there?\n",
        "train_df[\"total_lines\"].value_counts()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "11    24468\n",
              "10    23639\n",
              "12    22113\n",
              "9     19400\n",
              "13    18438\n",
              "14    14610\n",
              "8     12285\n",
              "15    10768\n",
              "7      7464\n",
              "16     7429\n",
              "17     5202\n",
              "6      3353\n",
              "18     3344\n",
              "19     2480\n",
              "20     1281\n",
              "5      1146\n",
              "21      770\n",
              "22      759\n",
              "23      264\n",
              "4       215\n",
              "24      200\n",
              "25      182\n",
              "26       81\n",
              "28       58\n",
              "3        32\n",
              "30       31\n",
              "27       28\n",
              "Name: total_lines, dtype: int64"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 113
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "oxDN9ASLL9uY",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 267
        },
        "outputId": "b3d59c36-e9ab-4798-b7c8-4cd91004b1d4"
      },
      "source": [
        "# Check the distribution of total lines\n",
        "train_df.total_lines.plot.hist();"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZEAAAD6CAYAAABgZXp6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAXpUlEQVR4nO3df7BfdX3n8efLRCpSkVDSLJNgg21Gl7r+gCvg1HatjCHg1tBdl4WtS5ZhiDNgV8f9QXQ6i8Uyk+5spdJatqlkTVwV8SfZEppGxHb7Bz+CIAjo5IqwJAJJDRDRFhZ97x/fz5Wv4ebyzbn53i/35vmY+c49530+55zPZ74TXpxzPt/vN1WFJEldvGjUHZAkzV6GiCSpM0NEktSZISJJ6swQkSR1ZohIkjobWogkeVWSO/tee5O8L8nRSbYm2d7+Lmjtk+TKJONJ7kpyYt+xVrX225Os6quflOTuts+VSTKs8UiSnisz8TmRJPOAncApwMXAnqpam2QNsKCqLklyJvC7wJmt3Uer6pQkRwPbgDGggNuBk6rqsSS3Av8BuAXYDFxZVTdM1Zdjjjmmli5dOpRxStJcdPvtt/99VS2cbNv8GerDacB3qurBJCuBt7T6BuBrwCXASmBj9VLt5iRHJTm2td1aVXsAkmwFViT5GnBkVd3c6huBs4ApQ2Tp0qVs27bt4I5OkuawJA/ub9tMPRM5B/hMW15UVQ+35UeARW15MfBQ3z47Wm2q+o5J6pKkGTL0EElyGPAO4HP7bmtXHUO/n5ZkdZJtSbbt3r172KeTpEPGTFyJnAF8vaoebeuPtttUtL+7Wn0ncFzffktabar6kknqz1FV66pqrKrGFi6c9LaeJKmDmQiRc3n2VhbAJmBihtUq4Lq++nltltapwBPtttcWYHmSBW0m13JgS9u2N8mpbVbWeX3HkiTNgKE+WE9yBPA24N195bXAtUkuAB4Ezm71zfRmZo0DPwLOB6iqPUk+DNzW2l028ZAduAj4BHA4vQfqUz5UlyQdXDMyxfeFZGxsrJydJUmDS3J7VY1Nts1PrEuSOjNEJEmdGSKSpM5m6hPrmqWWrrl+JOd9YO3bR3JeSQfGKxFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSps6GGSJKjknw+ybeS3JfkTUmOTrI1yfb2d0FrmyRXJhlPcleSE/uOs6q1355kVV/9pCR3t32uTJJhjkeS9LOGfSXyUeCvqurVwOuA+4A1wI1VtQy4sa0DnAEsa6/VwFUASY4GLgVOAU4GLp0Intbmwr79Vgx5PJKkPkMLkSQvB34DuBqgqp6uqseBlcCG1mwDcFZbXglsrJ6bgaOSHAucDmytqj1V9RiwFVjRth1ZVTdXVQEb+44lSZoBw7wSOR7YDfzPJHck+XiSI4BFVfVwa/MIsKgtLwYe6tt/R6tNVd8xSV2SNEOGGSLzgROBq6rqDcAPefbWFQDtCqKG2AcAkqxOsi3Jtt27dw/7dJJ0yBhmiOwAdlTVLW398/RC5dF2K4r2d1fbvhM4rm//Ja02VX3JJPXnqKp1VTVWVWMLFy6c1qAkSc8aWohU1SPAQ0le1UqnAfcCm4CJGVargOva8ibgvDZL61TgiXbbawuwPMmC9kB9ObClbdub5NQ2K+u8vmNJkmbA/CEf/3eBTyU5DLgfOJ9ecF2b5ALgQeDs1nYzcCYwDvyotaWq9iT5MHBba3dZVe1pyxcBnwAOB25oL0nSDBlqiFTVncDYJJtOm6RtARfv5zjrgfWT1LcBr5lmNyVJHfmJdUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiEiSOjNEJEmdGSKSpM4MEUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiE
iSOjNEJEmdGSKSpM4MEUlSZ4aIJKkzQ0SS1JkhIknqzBCRJHVmiEiSOhtqiCR5IMndSe5Msq3Vjk6yNcn29ndBqyfJlUnGk9yV5MS+46xq7bcnWdVXP6kdf7ztm2GOR5L0s2biSuQ3q+r1VTXW1tcAN1bVMuDGtg5wBrCsvVYDV0EvdIBLgVOAk4FLJ4Kntbmwb78Vwx+OJGnCKG5nrQQ2tOUNwFl99Y3VczNwVJJjgdOBrVW1p6oeA7YCK9q2I6vq5qoqYGPfsSRJM2DYIVLAXye5PcnqVltUVQ+35UeARW15MfBQ3747Wm2q+o5J6s+RZHWSbUm27d69ezrjkST1mT/k47+5qnYm+UVga5Jv9W+sqkpSQ+4DVbUOWAcwNjY29PNJ0qFiqFciVbWz/d0FfIneM41H260o2t9drflO4Li+3Ze02lT1JZPUJUkzZGghkuSIJC+bWAaWA98ENgETM6xWAde15U3AeW2W1qnAE+221xZgeZIF7YH6cmBL27Y3yaltVtZ5fceSJM2AYd7OWgR8qc26nQ98uqr+KsltwLVJLgAeBM5u7TcDZwLjwI+A8wGqak+SDwO3tXaXVdWetnwR8AngcOCG9pIkzZChhUhV3Q+8bpL694HTJqkXcPF+jrUeWD9JfRvwmml3VpLUiZ9YlyR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktTZQCGS5J8NuyOSpNln0CuRP0tya5KLkrx8qD2SJM0aA4VIVf068DvAccDtST6d5G1D7Zkk6QVv4GciVbUd+D3gEuCfA1cm+VaSfzmszkmSXtgGfSby2iRXAPcBbwV+q6r+aVu+Yoj9kyS9gM0fsN2fAB8HPlhV/zBRrKrvJfm9ofRMkvSCN+jtrLcDn54IkCQvSvJSgKr65FQ7JpmX5I4kf9nWj09yS5LxJJ9Nclir/1xbH2/bl/Yd4wOt/u0kp/fVV7TaeJI1BzJwSdL0DRoiXwEO71t/aasN4r30boNN+EPgiqr6FeAx4IJWvwB4rNWvaO1IcgJwDvCrwAp6M8XmJZkHfAw4AzgBOLe1lSTNkEFvZ72kqp6cWKmqJyeuRKaSZAm9q5jLgfcnCb3nKP+2NdkAfAi4CljZlgE+D/xpa78SuKaqngK+m2QcOLm1G6+q+9u5rmlt7x1wTHoBW7rm+pGd+4G1bx/ZuaXZZtArkR8mOXFiJclJwD9M0X7CHwP/BfhJW/8F4PGqeqat7wAWt+XFwEMAbfsTrf1P6/vss7+6JGmGDHol8j7gc0m+BwT4J8C/mWqHJP8C2FVVtyd5y7R6OU1JVgOrAV7xileMsiuSNKcMFCJVdVuSVwOvaqVvV9X/e57dfg14R5IzgZcARwIfBY5KMr9dbSwBdrb2O+l9mHFHkvnAy4Hv99Un9O+zv/q+/V8HrAMYGxur5+m3JGlAB/IFjG8EXgucSO8h9nlTNa6qD1TVkqpaSu/B+Fer6neAm4B3tmargOva8qa2Ttv+1aqqVj+nzd46HlgG3ArcBixrs70Oa+fYdADjkSRN00BXIkk+CfwycCfw41YuYGOHc14CXJPkD4A7gKtb/Wrgk+3B+R56oUBV3ZPkWnoPzJ8BLq6qH7d+vQfYAswD1lfVPR36I0nqaNBnImPACe3K4IBV1deAr7Xl+3l2dlV/m38E/vV+9r+c3gyvfeubgc1d+iRJmr5Bb2d9k97DdEmSfmrQK5FjgHuT3Ao8NVGsqncMpVeSpFlh0BD50DA7IUmanQad4vs3SX4JWFZVX2mfVp833K5Jkl7oBv0q+AvpfRXJn7fSYuDLw+qUJGl2GPTB+sX0Pjy4F376A1W/OKxOSZJmh0FD5KmqenpipX2i3E9+S9IhbtAQ+ZskHwQOb7+t/jngfw+vW5Kk2WDQEFkD7AbuBt5N7wN+/qKhJB3iBp2d9RPgL9pLkiRg8O/O+i
6TPAOpqlce9B5JkmaNA/nurAkvofcdV0cf/O5IkmaTgZ6JVNX3+147q+qP6f3srSTpEDbo7awT+1ZfRO/KZNCrGEnSHDVoEPxR3/IzwAPA2Qe9N5KkWWXQ2Vm/OeyOSJJmn0FvZ71/qu1V9ZGD0x1J0mxyILOz3sizv2H+W/R+53z7MDoljdLSNdeP5LwPrHWuimafQUNkCXBiVf0AIMmHgOur6l3D6pgk6YVv0K89WQQ83bf+dKtJkg5hg16JbARuTfKltn4WsGE4XZIkzRaDzs66PMkNwK+30vlVdcfwuiVJmg0GvZ0F8FJgb1V9FNiR5PipGid5SZJbk3wjyT1Jfr/Vj09yS5LxJJ9Nclir/1xbH2/bl/Yd6wOt/u0kp/fVV7TaeJI1BzAWSdJBMOjP414KXAJ8oJVeDPyv59ntKeCtVfU64PXAiiSnAn8IXFFVvwI8BlzQ2l8APNbqV7R2JDkBOAf4VWAF8GdJ5iWZB3wMOAM4ATi3tZUkzZBBr0R+G3gH8EOAqvoe8LKpdqieJ9vqi9urgLfS+7126D1XOastr+TZ5yyfB05Lkla/pqqeqqrvAuPAye01XlX3t19dvKa1lSTNkEFD5OmqKtrXwSc5YpCd2hXDncAuYCvwHeDxqnqmNdkBLG7Li4GHANr2J4Bf6K/vs8/+6pKkGTJoiFyb5M+Bo5JcCHyFAX6gqqp+XFWvp/c5k5OBV3fu6TQkWZ1kW5Jtu3fvHkUXJGlOet7ZWe2W0mfpBcBe4FXAf62qrYOepKoeT3IT8CZ6QTS/XW0sAXa2ZjuB4+g9tJ8PvBz4fl99Qv8++6vve/51wDqAsbGx5/y4liSpm+e9Emm3sTZX1daq+s9V9Z8GCZAkC5Mc1ZYPB94G3AfcBLyzNVsFXNeWN7V12vavtnNvAs5ps7eOB5bR+8qV24BlbbbXYfQevk98LYskaQYM+mHDryd5Y1XddgDHPhbY0GZRvQi4tqr+Msm9wDVJ/gC4A7i6tb8a+GSScWAPvVCgqu5Jci1wL72vob+4qn4MkOQ9wBZgHrC+qu45gP5JkqZp0BA5BXhXkgfozdAKvYuU1+5vh6q6C3jDJPX76T0f2bf+j/R+dneyY10OXD5JfTOwebAhSJIOtilDJMkrqur/AqdP1U6SdGh6viuRL9P79t4Hk3yhqv7VTHRKkjQ7PN+D9fQtv3KYHZEkzT7PFyK1n2VJkp73dtbrkuyld0VyeFuGZx+sHznU3kmSXtCmDJGqmjdTHZEkzT4H8lXwkiT9DENEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktTZoD9KpRFauub6UXdBkibllYgkqTNDRJLUmSEiSerMEJEkdWaISJI6G1qIJDkuyU1J7k1yT5L3tvrRSbYm2d7+Lmj1JLkyyXiSu5Kc2HesVa399iSr+uonJbm77XNlkjy3J5KkYRnmlcgzwH+sqhOAU4GLk5wArAFurKplwI1tHeAMYFl7rQaugl7oAJcCpwAnA5dOBE9rc2HffiuGOB5J0j6GFiJV9XBVfb0t/wC4D1gMrAQ2tGYbgLPa8kpgY/XcDByV5FjgdGBrVe2pqseArcCKtu3Iqrq5qgrY2HcsSdIMmJFnIkmWAm8AbgEWVdXDbdMjwKK2vBh4qG+3Ha02VX3HJPXJzr86ybYk23bv3j2tsUiSnjX0EEny88AXgPdV1d7+be0Koobdh6paV1VjVTW2cOHCYZ9Okg4ZQw2RJC+mFyCfqqovtvKj7VYU7e+uVt8JHNe3+5JWm6q+ZJK6JGmGDHN2VoCrgfuq6iN9mzYBEzOsVgHX9dXPa7O0TgWeaLe9tgDLkyxoD9SXA1vatr1JTm3nOq/vWJKkGTDML2D8NeDfAXcnubPVPgisBa5NcgHwIHB227YZOBMYB34EnA9QVXuSfBi4rbW7rKr2tOWLgE8AhwM3tJckaYYMLUSq6u+A/X1u47RJ2hdw8X6OtR5YP0
l9G/CaaXRTkjQNfmJdktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnQ0tRJKsT7IryTf7akcn2Zpke/u7oNWT5Mok40nuSnJi3z6rWvvtSVb11U9Kcnfb58okGdZYJEmTmz/EY38C+FNgY19tDXBjVa1NsqatXwKcASxrr1OAq4BTkhwNXAqMAQXcnmRTVT3W2lwI3AJsBlYANwxxPNJQLV1z/UjO+8Dat4/kvJobhnYlUlV/C+zZp7wS2NCWNwBn9dU3Vs/NwFFJjgVOB7ZW1Z4WHFuBFW3bkVV1c1UVvaA6C0nSjJrpZyKLqurhtvwIsKgtLwYe6mu3o9Wmqu+YpC5JmkEje7DeriBqJs6VZHWSbUm27d69eyZOKUmHhJkOkUfbrSja312tvhM4rq/dklabqr5kkvqkqmpdVY1V1djChQunPQhJUs9Mh8gmYGKG1Srgur76eW2W1qnAE+221xZgeZIFbSbXcmBL27Y3yaltVtZ5fceSJM2Qoc3OSvIZ4C3AMUl20JtltRa4NskFwIPA2a35ZuBMYBz4EXA+QFXtSfJh4LbW7rKqmnhYfxG9GWCH05uV5cwsSZphQwuRqjp3P5tOm6RtARfv5zjrgfWT1LcBr5lOHyVJ0+Mn1iVJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSpM0NEktSZISJJ6swQkSR1ZohIkjozRCRJnRkikqTODBFJUmeGiCSps/mj7oCk0Vq65vqRnfuBtW8f2bl1cHglIknqbNZfiSRZAXwUmAd8vKrWDutco/w/NmkuGtW/Ka+ADp5ZfSWSZB7wMeAM4ATg3CQnjLZXknTomNUhApwMjFfV/VX1NHANsHLEfZKkQ8Zsv521GHiob30HcMqI+iJplnAywcEz20NkIElWA6vb6pNJvj3K/kziGODvR92JIZvrY3R8s9+MjDF/OOwz7Nd0xvdL+9sw20NkJ3Bc3/qSVvsZVbUOWDdTnTpQSbZV1dio+zFMc32Mjm/2m+tjHNb4ZvszkduAZUmOT3IYcA6wacR9kqRDxqy+EqmqZ5K8B9hCb4rv+qq6Z8TdkqRDxqwOEYCq2gxsHnU/pukFe6vtIJrrY3R8s99cH+NQxpeqGsZxJUmHgNn+TESSNEKGyIgleSDJ3UnuTLJt1P05GJKsT7IryTf7akcn2Zpke/u7YJR9nI79jO9DSXa29/HOJGeOso/TkeS4JDcluTfJPUne2+pz4j2cYnxz6T18SZJbk3yjjfH3W/34JLckGU/y2TYhaXrn8nbWaCV5ABirqjkzBz/JbwBPAhur6jWt9t+APVW1NskaYEFVXTLKfna1n/F9CHiyqv77KPt2MCQ5Fji2qr6e5GXA7cBZwL9nDryHU4zvbObOexjgiKp6MsmLgb8D3gu8H/hiVV2T5H8A36iqq6ZzLq9EdNBV1d8Ce/YprwQ2tOUN9P7Rzkr7Gd+cUVUPV9XX2/IPgPvofTvEnHgPpxjfnFE9T7bVF7dXAW8FPt/qB+U9NERGr4C/TnJ7+2T9XLWoqh5uy48Ai0bZmSF5T5K72u2uWXmrZ19JlgJvAG5hDr6H+4wP5tB7mGRekjuBXcBW4DvA41X1TGuyg4MQnobI6L25qk6k903EF7dbJXNa9e6hzrX7qFcBvwy8HngY+KPRdmf6kvw88AXgfVW1t3/bXHgPJxnfnHoPq+rHVfV6et/kcTLw6mGcxxAZsara2f7uAr5E782eix5t96In7knvGnF/DqqqerT9o/0J8BfM8vex3Uf/AvCpqvpiK8+Z93Cy8c2193BCVT0O3AS8CTgqycTnAyf9mqgDZYiMUJIj2oM9khwBLAe+OfVes9YmYFVbXgVcN8K+HHQT/3
FtfptZ/D62h7JXA/dV1Uf6Ns2J93B/45tj7+HCJEe15cOBt9F79nMT8M7W7KC8h87OGqEkr6R39QG9bw/4dFVdPsIuHRRJPgO8hd63hj4KXAp8GbgWeAXwIHB2Vc3Kh9P7Gd9b6N0GKeAB4N19zw9mlSRvBv4PcDfwk1b+IL3nBrP+PZxifOcyd97D19J7cD6P3sXCtVV1WftvzjXA0cAdwLuq6qlpncsQkSR15e0sSVJnhogkqTNDRJLUmSEiSerMEJEkdWaISJI6M0QkSZ0ZIpKkzv4/2LyLCkd/AwYAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iBWX2cIHN_1J"
      },
      "source": [
        "Looking at the distribution of our `\"total_lines\"` column, a value of 20 looks like it covers the majority of samples.\n",
        "\n",
        "We can confirm this with [`np.percentile()`](https://numpy.org/doc/stable/reference/generated/numpy.percentile.html)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "or736pZLNwWn",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "7e716884-695a-4645-f0a9-f199a9e2ff02"
      },
      "source": [
        "# Check the coverage of a \"total_lines\" value of 20\n",
        "np.percentile(train_df.total_lines, 98) # a value of 20 covers 98% of samples"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "20.0"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 115
        }
      ]
    },
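    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One detail worth checking (a small sketch, assuming `train_df` from earlier in the notebook is still in memory): `tf.one_hot(..., depth=20)` only has slots for the values 0 to 19, so a sample with a `\"total_lines\"` value of exactly 20 would be encoded as all 0's. The fraction of samples that fit inside a given depth can be computed directly, which may help when experimenting with other `depth` values."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Fraction of training samples whose \"total_lines\" value fits inside a one-hot tensor of a given depth\n",
        "# (valid one-hot indices are 0..depth-1, so the comparison is strictly less than)\n",
        "depth = 20\n",
        "fraction_in_range = (train_df[\"total_lines\"] < depth).mean()\n",
        "print(f\"Fraction of samples with total_lines < {depth}: {fraction_in_range:.4f}\")"
      ],
      "execution_count": null,
      "outputs": []
    },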
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dy8Ds74HOLXQ"
      },
      "source": [
        "Beautiful! Plenty of coverage. Let's one-hot-encode our `\"total_lines\"` column just as we did our `\"line_number\"` column."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Egqq3LnnN0Z6",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "9719aa05-b344-4785-d16a-212cf9b1c13c"
      },
      "source": [
        "# Use TensorFlow to create one-hot-encoded tensors of our \"total_lines\" column \n",
        "train_total_lines_one_hot = tf.one_hot(train_df[\"total_lines\"].to_numpy(), depth=20)\n",
        "val_total_lines_one_hot = tf.one_hot(val_df[\"total_lines\"].to_numpy(), depth=20)\n",
        "test_total_lines_one_hot = tf.one_hot(test_df[\"total_lines\"].to_numpy(), depth=20)\n",
        "\n",
        "# Check shape and samples of total lines one-hot tensor\n",
        "train_total_lines_one_hot.shape, train_total_lines_one_hot[:10]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(TensorShape([180040, 20]), <tf.Tensor: shape=(10, 20), dtype=float32, numpy=\n",
              " array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.],\n",
              "        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0.,\n",
              "         0., 0., 0., 0.]], dtype=float32)>)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 116
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JVJWCANtQMiJ"
      },
      "source": [
        "### Building a tribrid embedding model\n",
        "\n",
        "Woohoo! Positional embedding tensors ready.\n",
        "\n",
        "It's time to build the biggest model we've built yet. One which incorporates token embeddings, character embeddings and our newly crafted positional embeddings.\n",
        "\n",
        "We'll be venturing into uncharted territory but there will be nothing here you haven't practiced before.\n",
        "\n",
        "More specifically we're going to go through the following steps:\n",
        "\n",
        "1. Create a token-level model (similar to `model_1`)\n",
        "2. Create a character-level model (similar to `model_3` with a slight modification to reflect the paper)\n",
        "3. Create a `\"line_number\"` model (takes in one-hot-encoded `\"line_number\"` tensor and passes it through a non-linear layer)\n",
        "4. Create a `\"total_lines\"` model (takes in one-hot-encoded `\"total_lines\"` tensor and passes it through a non-linear layer)\n",
        "5. Combine (using [`layers.Concatenate`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Concatenate)) the outputs of 1 and 2 into a token-character-hybrid embedding and pass it through a series of layers similar to Figure 1 and section 4.2 of [*Neural Networks for Joint Sentence Classification in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf)\n",
        "6. Combine (using [`layers.Concatenate`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Concatenate)) the outputs of 3, 4 and 5 into a token-character-positional tribrid embedding \n",
        "7. Create an output layer to accept the tribrid embedding and output predicted label probabilities\n",
        "8. Combine the inputs of 1, 2, 3, 4 and outputs of 7 into a [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model)\n",
        "\n",
        "Woah! That's a lot... but nothing we're not capable of. Let's code it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "aPiFnY8E0oPS"
      },
      "source": [
        "# 1. Token inputs\n",
        "token_inputs = layers.Input(shape=[], dtype=\"string\", name=\"token_inputs\")\n",
        "token_embeddings = tf_hub_embedding_layer(token_inputs)\n",
        "# NOTE: `token_outputs` is not used below; `token_model` outputs the raw 512-dim embeddings\n",
        "token_outputs = layers.Dense(128, activation=\"relu\")(token_embeddings)\n",
        "token_model = tf.keras.Model(inputs=token_inputs,\n",
        "                             outputs=token_embeddings)\n",
        "\n",
        "# 2. Char inputs\n",
        "char_inputs = layers.Input(shape=(1,), dtype=\"string\", name=\"char_inputs\")\n",
        "char_vectors = char_vectorizer(char_inputs)\n",
        "char_embeddings = char_embed(char_vectors)\n",
        "char_bi_lstm = layers.Bidirectional(layers.LSTM(32))(char_embeddings)\n",
        "char_model = tf.keras.Model(inputs=char_inputs,\n",
        "                            outputs=char_bi_lstm)\n",
        "\n",
        "# 3. Line numbers inputs\n",
        "line_number_inputs = layers.Input(shape=(15,), dtype=tf.int32, name=\"line_number_input\")\n",
        "x = layers.Dense(32, activation=\"relu\")(line_number_inputs)\n",
        "line_number_model = tf.keras.Model(inputs=line_number_inputs,\n",
        "                                   outputs=x)\n",
        "\n",
        "# 4. Total lines inputs\n",
        "total_lines_inputs = layers.Input(shape=(20,), dtype=tf.int32, name=\"total_lines_input\")\n",
        "y = layers.Dense(32, activation=\"relu\")(total_lines_inputs)\n",
        "total_line_model = tf.keras.Model(inputs=total_lines_inputs,\n",
        "                                  outputs=y)\n",
        "\n",
        "# 5. Combine token and char embeddings into a hybrid embedding\n",
        "combined_embeddings = layers.Concatenate(name=\"token_char_hybrid_embedding\")([token_model.output, \n",
        "                                                                              char_model.output])\n",
        "z = layers.Dense(256, activation=\"relu\")(combined_embeddings)\n",
        "z = layers.Dropout(0.5)(z)\n",
        "\n",
        "# 6. Combine positional embeddings with combined token and char embeddings into a tribrid embedding\n",
        "z = layers.Concatenate(name=\"token_char_positional_embedding\")([line_number_model.output,\n",
        "                                                                total_line_model.output,\n",
        "                                                                z])\n",
        "\n",
        "# 7. Create output layer\n",
        "output_layer = layers.Dense(5, activation=\"softmax\", name=\"output_layer\")(z)\n",
        "\n",
        "# 8. Put together model\n",
        "model_5 = tf.keras.Model(inputs=[line_number_model.input,\n",
        "                                 total_line_model.input,\n",
        "                                 token_model.input, \n",
        "                                 char_model.input],\n",
        "                         outputs=output_layer)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KVhTeiveWf4X"
      },
      "source": [
        "There's a lot going on here... let's get a summary of our model and then plot it to visualize what's happening."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "n7eJOhlKfVQJ",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "359601b0-4191-4cfa-f71e-e195a5ae0fa1"
      },
      "source": [
        "# Get a summary of our token, char and positional embedding model\n",
        "model_5.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model_18\"\n",
            "__________________________________________________________________________________________________\n",
            "Layer (type)                    Output Shape         Param #     Connected to                     \n",
            "==================================================================================================\n",
            "char_inputs (InputLayer)        [(None, 1)]          0                                            \n",
            "__________________________________________________________________________________________________\n",
            "char_vectorizer (TextVectorizat (None, 290)          0           char_inputs[0][0]                \n",
            "__________________________________________________________________________________________________\n",
            "token_inputs (InputLayer)       [(None,)]            0                                            \n",
            "__________________________________________________________________________________________________\n",
            "char_embed (Embedding)          (None, 290, 25)      1750        char_vectorizer[4][0]            \n",
            "__________________________________________________________________________________________________\n",
            "universal_sentence_encoder (Ker (None, 512)          256797824   token_inputs[0][0]               \n",
            "__________________________________________________________________________________________________\n",
            "bidirectional_3 (Bidirectional) (None, 64)           14848       char_embed[4][0]                 \n",
            "__________________________________________________________________________________________________\n",
            "token_char_hybrid_embedding (Co (None, 576)          0           universal_sentence_encoder[4][0] \n",
            "                                                                 bidirectional_3[0][0]            \n",
            "__________________________________________________________________________________________________\n",
            "line_number_input (InputLayer)  [(None, 15)]         0                                            \n",
            "__________________________________________________________________________________________________\n",
            "total_lines_input (InputLayer)  [(None, 20)]         0                                            \n",
            "__________________________________________________________________________________________________\n",
            "dense_18 (Dense)                (None, 256)          147712      token_char_hybrid_embedding[0][0]\n",
            "__________________________________________________________________________________________________\n",
            "dense_16 (Dense)                (None, 32)           512         line_number_input[0][0]          \n",
            "__________________________________________________________________________________________________\n",
            "dense_17 (Dense)                (None, 32)           672         total_lines_input[0][0]          \n",
            "__________________________________________________________________________________________________\n",
            "dropout_4 (Dropout)             (None, 256)          0           dense_18[0][0]                   \n",
            "__________________________________________________________________________________________________\n",
            "token_char_positional_embedding (None, 320)          0           dense_16[0][0]                   \n",
            "                                                                 dense_17[0][0]                   \n",
            "                                                                 dropout_4[0][0]                  \n",
            "__________________________________________________________________________________________________\n",
            "output_layer (Dense)            (None, 5)            1605        token_char_positional_embedding[0\n",
            "==================================================================================================\n",
            "Total params: 256,964,923\n",
            "Trainable params: 167,099\n",
            "Non-trainable params: 256,797,824\n",
            "__________________________________________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
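    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The trainable parameter counts in the summary can be reproduced by hand, which is a handy sanity check that the layers are wired the way we expect. The sketch below uses the standard parameter formulas for `Dense` and `LSTM` layers; the helper names are made up for illustration, and the character vocabulary size of 70 is an assumption inferred from the `char_embed` row (1750 / 25)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Dense layer: one weight per (input, output) pair plus one bias per output\n",
        "def dense_params(n_in, n_out):\n",
        "    return n_in * n_out + n_out\n",
        "\n",
        "# LSTM layer: 4 gates, each with input weights, recurrent weights and a bias\n",
        "def lstm_params(n_in, units):\n",
        "    return 4 * ((n_in + units) * units + units)\n",
        "\n",
        "layer_params = {\n",
        "    \"char_embed\": 70 * 25,                              # assumed 70 character tokens x 25 dims\n",
        "    \"bidirectional\": 2 * lstm_params(25, 32),           # forward + backward LSTM\n",
        "    \"dense (hybrid)\": dense_params(512 + 64, 256),      # token (512) + char (64) concatenated\n",
        "    \"dense (line_number)\": dense_params(15, 32),\n",
        "    \"dense (total_lines)\": dense_params(20, 32),\n",
        "    \"output_layer\": dense_params(32 + 32 + 256, 5),     # tribrid embedding -> 5 classes\n",
        "}\n",
        "\n",
        "for name, count in layer_params.items():\n",
        "    print(f\"{name}: {count:,}\")\n",
        "\n",
        "# Should match the \"Trainable params\" line of the summary (167,099)\n",
        "print(f\"Trainable total: {sum(layer_params.values()):,}\")"
      ],
      "execution_count": null,
      "outputs": []
    },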
    {
      "cell_type": "code",
      "metadata": {
        "id": "uM0dohpZ_v5U",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 856
        },
        "outputId": "ae34b5d4-e67a-4f69-ed3b-39d0f637e46d"
      },
      "source": [
        "# Plot the token, char, positional embedding model\n",
        "from tensorflow.keras.utils import plot_model\n",
        "plot_model(model_5)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAA/sAAANHCAYAAABpcE00AAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nOzdaXxU5f3///dknUwgk7AGCCEQVEDRKmApYhWtClgRCEIEVKgouAEWFUWlKKCiFvjK4lIRf0KLScCigrt1rYgrRVEQsYLIErYkQIAsfP43/DN1IMskmWSSk9fz8ciNnHPmOp/rzDVX5p0554zLzEwAAAAAAMApssJCXQEAAAAAAAguwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOExEqAtwuiuuuCLUJQB1VlZWVqhLAAAAAOokF3fjr14ul0vdu3dXUlJSqEsB6oytW7fq448/FtMTAAAAUClZfLJfA2699VYNHjw41GUAdUZmZqaGDBkS6jIAAACAOotr9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhP06ZOTIkXK73XK5XDp8+HBIa3nllVfk9Xr18ssvh7SO6vLxxx+rY8eOCgsLk8vlUvPmzTVt2rRQl+Vn2bJlateunVwul1wulxITEzV8+PBQlwUAAACgFogIdQEI3MKFC9WqVStNnz491KXIzEJdQrXq3r27vv32W/Xu3Vuvv/66NmzYoPj4+FCX5SctLU1paWlq3769du/erR07doS6JAAAAAC1BJ/so1IuvfRS5ebm6rLLLgt1KTp06JB69OgR6jKqXX3pJwAAAICqI+zXUS6XK9Ql1BoLFixQdnZ2qMuodvWlnwAAAACqjrBfCy1atEhdu3aV2+1WbGysUlJSNHXqVN/6sLAwrVy5Un369JHX61WLFi30zDPP+LXxwQcfqFOnTvJ6vXK73ercubNef/11SdLDDz8sj8ejhg0bKjs7WxMmTFCrVq20YcOGgOr78MMPlZycLJfLpblz50qS5s+fr9jYWHk8Hr344ovq06eP4uLilJSUpCVLlvge+9hjj8ntdqtZs2YaM2aMWrRoIbfbrR49emj16tW+7caOHauoqCglJib6lt10002KjY2Vy+XS7t27JUnjx4/XhAkTtGnTJrlcLrVv316S9N577+nss8+Wx+NRXFycOnfurLy8PEnSa6+9pri4uEpdDlHb+llRZY2LUaNG+a7/T01N1Zdffinpl3tFeDweeb1evfTSS5Kk4uJiTZ48WcnJyYqJidHpp5+ujIwMSVUfXwAAAACCwFCtJFlGRkbA28+aNcsk2YMPPmh79uyxvXv32pNPPmnDhg0zM7O7777bJNnbb79tOTk5tnfvXuvbt69FR0fbwYMHfe1kZWXZlClTbO/evbZnzx7r3r27NW7c2Lf+WDvjxo2zOXPm2MCBA+3bb78NuM6ffvrJJNmcOXNOaPPtt9+23Nxcy87OtnPPPddiY2OtoKDAt93o0aMtNjbWvvnmGzt8+LCtW7fOunXrZg0bNrQtW7b4ths2bJg1b97cb7+PPPKISbJdu3b5lqWlpVlqaqrv9wMHDlhcXJzNmDHDDh06ZDt27LCBAwf6HrNixQpr2LCh3X///eX285JLLjFJtm/fvlrXz2NSU1PN6/WW2xez8sdFWlqahYeH288//+z3uKFDh9pLL73k+/22226z6OhoW7p0qe3bt88mTZpkYWFh9umnn/odo8qOr4yMDGN6AgAAACotk0/2a5HCwkLdd9996tWrl+688041atRICQkJuvbaa9WtWze/bXv06CGv16uEhASlp6fryJEj+u9//+tbP2jQIP3lL39RQkKCGjVqpH79+mnPnj3atWuXXzsPPfSQbr75Zi1btkwdOnQISj969OihuLg4NW3aVOnp6Tp48KC2bNnit01ERIQ6duyo6OhoderUSf
Pnz9f+/fu1cOHCKu//xx9/VF5enk499VS53W41b95cy5YtU5MmTST9cr+BvLw83XvvvVXaT6j7WRnljYsbbrhBxcXFfvXl5eXp008/Vd++fSVJhw8f1vz58zVgwAClpaUpPj5e99xzjyIjI0/oV3WMLwAAAADlI+zXImvXrlVOTo4uueQSv+Xh4eEaN25cqY+LjIyU9Ms/C8rbpri4OAiVBi4qKkpS2bVJUteuXeXxeLR+/foq77Ndu3Zq1qyZhg8frilTpujHH3+scpvlCUU/g+H4cXHBBRfo5JNP1jPPPOP7xoXnn39e6enpCg8PlyRt2LBB+fn5Ou2003ztxMTEKDExsdb0CwAAAKjvCPu1yLFryoPxFW8rV67U+eefr6ZNmyo6Olp33HFHldusbtHR0SeceVAZMTEx+te//qWePXtq+vTpateundLT03Xo0KEgVFl1wepnZZQ3Llwul8aMGaMffvhBb7/9tiTpueee07XXXuvb5uDBg5Kke+65x3eNv8vl0ubNm5Wfn19znQEAAABQKsJ+LdKyZUtJ8t2UrbK2bNmiAQMGKDExUatXr1Zubq5mzJgRjBKrTWFhoXJycpSUlBSU9k499VS9/PLL2rZtmyZOnKiMjAw9+uijQWm7KoLdz/K8//77mjVrlqTAx8WIESPkdrv19NNPa8OGDYqLi1ObNm1865s2bSpJmjVrlszM72fVqlU10i8AAAAAZSPs1yIpKSlq1KiR3njjjSq189VXX6mwsFA33nij2rVrJ7fbXeu/qu/dd9+Vmal79+6+ZREREeWeFl+Sbdu26ZtvvpH0SzB98MEHddZZZ/mWhVIw+xmIzz//XLGxsZICHxcJCQkaMmSIli9frkcffVTXXXed3/rWrVvL7XZrzZo11VIzAAAAgKoj7Nci0dHRmjRpkt5//32NHTtWP//8s44ePar9+/dXKKgmJydLkt566y0dPnxYGzdu9Pu6t9rg6NGj2rdvn4qKirR27VqNHz9eycnJGjFihG+b9u3ba+/evVq+fLkKCwu1a9cubd68+YS2GjVqpG3btunHH3/U/v37tXnzZo0ZM0br169XQUGBvvzyS23evNkXsF999dVKf/VebepnWf8gKCws1M6dO/Xuu+/6wn5FxsUNN9ygI0eOaMWKFbrsssv81rndbo0cOVJLlizR/PnzlZeXp+LiYm3dulXbt2+v6CECAAAAUB1C+FUA9YIq+NV7ZmZz5861zp07m9vtNrfbbWeeeabNmzfPZsyYYTExMSbJTjrpJNu0aZMtXrzYEhISTJIlJSXZ119/bWZmEydOtEaNGll8fLxdccUVNnfuXJNkqampdvPNN/vaad26tS1atKhC9c2ZM8cSExNNknk8HuvXr5/NmzfPPB6PX21PPfWUxcXFmSRr06aNfffdd2b2y1fSRUZGWqtWrSwiIsLi4uKsf//+tmnTJr/97Nmzx3r16mVut9vatm1rt9xyi91+++0mydq3b+/7+rovvvjC2rRpYzExMdazZ09bvXq19ejRwxISEiw8PNxatmxpd999txUVFZmZ2SuvvGINGza0adOmldrHjz/+2E499VQLCwszSZaYmGjTp0+vVf18/PHHLTU11SSV+fPCCy/49lXWuPj11wGamZ155pl21113lXh8jhw5YhMnTrTk5GSLiIiwpk2bWlpamq1bt85vnFZmfJnx1XsAAABAFWW6zP7/W26jWrhcLmVkZGjw4MGhLqXWGDNmjLKysrRnz55Ql1Kt6no/L730Us2dO1dt27at8X1nZmZqyJAhYnoCAAAAKiWL0/gREjX9FYChUpf6+evLAtauXSu32x2SoA8AAACg6gj7kCStX7/e72vUSvtJT08PdamoJhMnTtTGjRv13XffaeTIkZo6dWqoSwIAAABQSYR9SJI6dOhwwteolfTz/PPPV2k/kyZN0sKFC5Wbm6u2bdtq6dKlQepB7VIX++nxeNShQwf94Q9/0JQpU9SpU6dQlw
QAAACgkrhmv5pxzT5QcVyzDwAAAFQJ1+wDAAAAAOA0hH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADuMyMwt1EU7mcrnUvXt3JSUlhboUoM7YunWrPv74YzE9AQAAAJWSFRHqCpxu0KBBoS4Bpfjss88kSV27dg1xJTheUlISrx0AAACgCvhkH/XW4MGDJUmZmZkhrgQAAAAAgiqLa/YBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHCYiFAXANSEZ599VrNnz1ZxcbFv2a5duyRJnTt39i0LDw/X+PHjNWLEiJouEQAAAACCxmVmFuoigOq2YcMGdejQIaBtv/3224C3BQAAAIBaKIvT+FEvnHLKKercubNcLlep27hcLnXu3JmgDwAAAKDOI+yj3rj66qsVHh5e6vqIiAhdc801NVgRAAAAAFQPTuNHvbFt2zYlJSWptCHvcrm0ZcsWJSUl1XBlAAAAABBUnMaP+qNly5bq0aOHwsJOHPZhYWHq0aMHQR8AAACAIxD2Ua9cddVVJV6373K5dPXVV4egIgAAAAAIPk7jR72yd+9eNW/eXEVFRX7Lw8PDtXPnTjVu3DhElQEAAABA0HAaP+qXRo0a6aKLLlJERIRvWXh4uC666CKCPgAAAADHIOyj3hk+fLiOHj3q+93MdNVVV4WwIgAAAAAILk7jR71z8OBBNWnSRIcPH5YkRUdHa/fu3WrQoEGIKwMAAACAoOA0ftQ/sbGx6tevnyIjIxUREaH+/fsT9AEAAAA4CmEf9dKwYcNUVFSk4uJiDR06NNTlAAAAAEBQRZS/Sd2QmZkZ6hJQhxQXF8vtdsvMdODAAcYPKmTw4MGhLgEAAAAok2Ou2S/pu9MBoDo4ZNoEAACAc2U55pN9ScrIyOATNwTsnXfekcvl0vnnnx/qUlBHZGZmasiQIaEuAwAAACiXo8I+UBHnnXdeqEsAAAAAgGpB2Ee9FRbG/SkBAAAAOBNpBwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPZLMHLkSLndbrlcLh0+fDjU5dRZr7zyirxer15++eVQlyJJSk9Pl8vlCuhnxYoV1VbH6NGjFRsbK5fLpcjISJ1xxhn69ttv/bZ55plnlJycLJfLpebNm+vZZ5+ttnoqq6ae39o2jgAAAIC6gLBfgoULF+q2224LdRl1npmFuoQTvPHGG8rJyVFhYaG2b98uSerXr58KCgp08OBBZWdn67rrrqvWGp588kmtWrVKktSlSxf95z//UceOHf22+dOf/qQPPvhALVu21NatWzVixIhqrakyaur5rY3jCAAAAKjtCPuQJB06dEg9evQIapuXXnqpcnNzddlllwW13cpyuVw655xz5PV6FRER4bc8MjJSHo9HTZs2VZcuXYK635KO7emnn66ePXtq9erV+uKLL0p83BNPPKE//elPioyMrJYaqqo6nt
+S6qxt4wgAAACoCwj75XC5XKEuoUYsWLBA2dnZoS6jWi1ZskQej6fc7UaPHq0//vGPQdtvacf25ptvliTNmzfvhHUFBQV67rnnNHr06GqtobapK3UCAAAAtV29DvuLFi1S165d5Xa7FRsbq5SUFE2dOtW3PiwsTCtXrlSfPn3k9XrVokULPfPMM35tfPDBB+rUqZO8Xq/cbrc6d+6s119/XZL08MMPy+PxqGHDhsrOztaECRPUqlUrbdiwIaD6OnbsKJfLpbCwMHXp0kX5+fmSpDvuuMO3v2PXchcXF2vy5MlKTk5WTEyMTj/9dGVkZATU3/Hjx2vChAnatGmTXC6X2rdvL+mX06dnzpypjh07Kjo6WgkJCerfv7/Wr1/va7O0Pi5YsMB3zfncuXMlSd9//32p18i/+eab5fajrOP52muvKS4uTtOnTw/o2AairFqeffZZNWjQQC6XSwkJCVq+fLk+++wztWnTRuHh4Ro6dKgklXpsJSktLU0tW7bU888/r5ycHL99L126VL/97W+VlJTkqOe3rNdLSXV++OGHJ+wn0Nrnz5+v2NhYeTwevfjii+rTp4/i4uKUlJSkJUuWVGFkAAAAAHWAOYQky8jICHj7WbNmmSR78MEHbc+ePbZ371578sknbdiwYWZmdvfdd5ske/vtty0nJ8f27t1rffv2tejoaDt48KCvnaysLJsyZYrt3bvX9uzZY927d7fGjRv71h9rZ9y4cTZnzhwbOHCgffvttwHVWFRUZCkpKZacnGxFRUV+62699VabNWuW7/fbbrvNoqOjbenSpbZv3z6bNGmShYWF2aeffhpQf9PS0iw1NdVvH5MnT7aoqChbtGiR5eTk2Nq1a+2ss86yJk2a2I4dO8rt408//WSSbM6cOWZmtnHjRrvzzjt9x2/79u2WkJBgPXr0sOLi4oD6Udq+VqxYYQ0bNrT7778/oGN7bP+S7PLLLy9xfXm1fPPNN+bxeOyaa67xPeauu+6yp59+2q+dko7tMVOmTDFJNnPmTL/lPXv2tLfeeivgWurK81ve66WkOo/fT2Vqf/vtty03N9eys7Pt3HPPtdjYWCsoKCjxOSlLRkaGOWjaBAAAgHNlOuZda0XCfkFBgcXHx1uvXr38lhcVFdns2bPN7H8h4dChQ771zz33nEmyr7/+utS2H3jgAZNk2dnZpbZTEcdCXGZmpm/ZwYMHLTk52XJzc83M7NChQ+bxeCw9Pd23TX5+vkVHR9uNN94YUH+PD1n5+fnWoEEDvzbNzD755BOT5BeqS+tjSSHt1wYMGGBut9vWr18fUD/K2ldllBX2A6nFzOzJJ580SbZ48WL7xz/+YX/+859PaKussL99+3aLjIy0k08+2Y4ePWpmZmvXrrUOHToEXEtdeX5LcvzrJZCwX9Xa582bZ5Ls+++/L7Wu0hD2AQAAUEdk1svT+NeuXaucnBxdcsklfsvDw8M1bty4Uh937EZphYWF5W5TXFwchEqlUaNGyev1avbs2b5lixcvVv/+/RUXFydJ2rBhg/Lz83Xaaaf5tomJiVFiYqLWr19fqf6uW7dOBw4cUNeuXf2Wd+vWTVFRUVq9enWV+pWZmal//vOfuu+++3TKKacE1I+aFGgt119/vQYNGqQxY8YoMzNTDz/8cIX2k5iYqLS0NH333Xd66623JEmPP/64brjhhoBrqSvPb0kq83qpau1RUVGSyn4dAwAAAHVdvQz7eXl5kqT4+Pgqt7Vy5Uqdf/75atq0qaKjo3XHHXdUuc1fa9Cgga6//np99NFH+uSTTyT9EgbHjh3r2+bgwYOSpHvuucfvOunNmzcrPz+/Uv09dg15gwYNTlgXHx+v/fv3V7pPe/bs0S233KJu3bppwoQJAfejJlWklunTp+vAgQOVvrHcsRv1zZ8/X/v379c///lPXXPNNQHXUleeXyk4r5fqrB0AAABwinoZ9lu2bClJ2r17d5Xa2bJliwYMGKDExEStXr1aubm5mj
FjRjBK9DN27FhFRkZq1qxZev/999W6dWulpqb61jdt2lSSNGvWLJmZ38+qVasq1d9jwbGk4JSTk+O7cVxljBs3Tjk5OVq4cKHCw8MD7kdNCrSWwsJCjRs3TjNnztSqVas0bdq0Cu/rnHPO0ZlnnqmXX35ZDz74oC6//HJ5vd6Aa6krz2+wXi/VWTsAAADgFPUy7KekpKhRo0Z64403qtTOV199pcLCQt14441q166d3G53tXxVX1JSkgYPHqylS5fq3nvv1fjx4/3Wt27dWm63W2vWrCnx8ZXp72mnnaYGDRros88+81u+evVqFRQUVPq76FeuXKm///3vuvfee3Xqqaf6lt9+++3l9qMmBVrLLbfcouuuu0633nqr/vznP2vq1KmV+sfETTfdpOLiYj300EO68cYbK1RLXXl+g/V6qa7aAQAAACepl2E/OjpakyZN0vvvv6+xY8fq559/1tGjR7V//3598803AbeTnJwsSXrrrbd0+PBhbdy4scrXOpdmwoQJKioq0r59+3TBBRf4rXO73Ro5cqSWLFmi+fPnKy8vT8XFxdq6dau2b98eUH8bNWqkbdu26ccff9T+/fsVHh6uCRMm6IUXXtDixYuVl5enr776SjfccINatGhRqe9/z8vL05gxY/Sb3/xGd955pyTp8OHD+uyzz7RmzZpy+1GWV199NahfvRdILfPmzVOrVq00cOBASdIDDzygTp06adiwYb5T66UTj21J14oPHTpUjRo10jnnnKPTTz+9QrXUlec3kNdLIMfK7XYHvXYAAADAcWr+poDVQxX86j0zs7lz51rnzp3N7Xab2+22M8880+bNm2czZsywmJgYk2QnnXSSbdq0yRYvXmwJCQkmyZKSknx35J84caI1atTI4uPj7YorrrC5c+eaJEtNTbWbb77Z107r1q1t0aJFVepjr169Tvhat2OOHDliEydOtOTkZIuIiLCmTZtaWlqarVu3rtz+mpl98cUX1qZNG4uJibGePXvajh077OjRo/bII4/YSSedZJGRkZaQkGADBgywDRs2+Nr89bH6dR/nzJljiYmJJsk8Ho/169fPHn30UZNU4k/fvn3L7Udp+zIze+WVV6xhw4Y2bdq0co9jXl6e/f73v7dGjRqZJAsLC7P27dvb9OnTAz6ml112mblcLmvUqJF99NFHZvbL1yGGhYWZJPN6vfbZZ5+VemxLcvvtt9s//vEPRz+/Zb1etmzZckKd99xzzwn7MbOAap83b555PB6/1/FTTz1lcXFxJsnatGlj3333Xbnj5de4Gz8AAADqiEyXmVmN/FehmrlcLmVkZGjw4MGhLgWAQ2VmZmrIkCFyyLQJAAAA58qql6fxAwAAAADgZIT9GrZ+/Xq/r08r7Sc9PT3UpQIAAAAA6qiIUBdQ33To0IFTgAEAAAAA1YpP9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwEaEuIJhWrVoV6hIAOBhzDAAAAOoKl5lZqIsIBpfLFeoSANQTDpk2AQAA4FxZjvlknzffqKjBgwdLkjIzM0NcCQAAAAAEF9fsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAA
AAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMBGhLgCoCe+9954+/vhjv2Xr16+XJM2YMcNveffu3XXeeefVWG0AAAAAEGwuM7NQFwFUtzfffFMXX3yxIiMjFRZW8gktR48eVWFhod544w1ddNFFNVwhAAAAAARNFmEf9UJxcbGaN2+uPXv2lLldQkKCsrOzFRHBSS8AAAAA6qwsrtlHvRAeHq5hw4YpKiqq1G2ioqJ01VVXEfQBAAAA1HmEfdQbV155pQoKCkpdX1BQoCuvvLIGKwIAAACA6sFp/KhX2rRpoy1btpS4LikpSVu2bJHL5arhqgAAAAAgqDiNH/XL8OHDFRkZecLyqKgoXXPNNQR9AAAAAI5A2Ee9Mnz4cBUWFp6wvKCgQOnp6SGoCAAAAACCj7CPeqVjx47q2LHjCcs7dOig0047LQQVAQAAAEDwEfZR71x99dV+p/JHRkbqmmuuCWFFAAAAABBc3KAP9c6WLVuUkpKiY0Pf5XLphx9+UEpKSmgLAwAAAIDg4AZ9qH+Sk5PVtWtXhYWFyeVyqVu3bgR9AAAAAI5C2Ee9dPXVVyssLEzh4eG66qqrQl0OAAAAAAQVp/GjXtq1a5datGghSfr555/VvHnzEFcEAAAAAEGTFRHqCpyG72mvexITE0NdAgLE/yYRSszvQPUbNGiQsrKyQl0GADgCYb8ajB8/Xr/73e9CXQbK8d5778nlcun3v/99qEtBOVatWqXZs2eHugyA+R2oRrNmzQp1CQDgKIT9avC73/1OgwcPDnUZKEfv3r0lSXFxcSGuBIEg7KM2YH4Hqg+f6ANAcBH2UW8R8gEAAAA4FXfjBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsF+LjRw5UrJGmWwAACAASURBVG63Wy6XS4cPHw51OdWiW7duCg8P129+85ugtz1q1Cg1bNhQLpdLa9asqdBjly1bpnbt2snlcpX6k5KSEpQ6a8MxKG27V155RV6vVy+//HLQawNQsgcffFBer7dSc1d1c/qc8PHHH6tjx44KCwuTy+VS8+bNNW3atFCX5ef4v0+JiYkaPnx4qMsCANRChP1abOHChbrttttCXUa1+vTTT9WrV69qafvpp5/W3/72t0o9Ni0tTT/88INSU1Pl9XplZjIzFRUVKT8/Xzt37pTH4wlKnbXhGJS2nZlVR1kAynDXXXfpySefDHUZJXL6nNC9e3d9++23uvjiiyVJGzZs0D333BPiqvwd//dpx44dWrx4cajLAgDUQoR91AoulyvUJQQkPDxcMTExatasmU4++eSgtl0bj8Gll16q3NxcXXbZZaEuBahzDh06pB49eoS6jKCqTXOCE49vSepLPwEAwUfYryNqYxAMpsjIyGpptzqP2/Lly4PaXqiPQU2MMTNTVlaWnnrqqWrfFxBqCxYsUHZ2dqjLcKz6cnzrSz8BAMFH2K8FFi1apK5du8rtdis2NlYpKSmaOnWqb31YWJhWrlypPn36yOv1qkWLFnrmmWf82vjggw/UqVMneb1eud1ude7cWa+//rok6eGHH5bH41HDhg2VnZ2tCRMmqFWrVtqwYUPANRYXF2vy5MlKTk5WTEyMTj/9dGVkZEiSZs+erdjYWIWFhalLly5q3ry5IiMjFRsbq7POOkvnnnuuWrduLbfbrfj4eN1xxx0ntP/999+rQ4cOio2NVUxMjM4991x9+OGHAdcg/RIkH3nkEZ1yyimKjo6W1+vV7bfffsK+XnvtNcXFxWn69OkB9788de0YBLLdhx9+qOTkZLlcLs2dO1eSNH/+fMXGxsrj8ejFF19Unz59FBcXp6SkJC1ZsuSEWh944AGdcsopiomJUZ
MmTdS2bVs98MADGjx4cKWPNVAXjB8/XhMmTNCmTZvkcrnUvn17Sb+89mbOnKmOHTsqOjpaCQkJ6t+/v9avX19mezt37lRKSooiIiLUu3dv3/Ky5oSKvF4DUZU54bHHHpPb7VazZs00ZswYtWjRQm63Wz169NDq1at9240dO1ZRUVFKTEz0LbvpppsUGxsrl8ul3bt3l3l833vvPZ199tnyeDyKi4tT586dlZeXJ6lqc39t62dFlfUeYdSoUb7r/1NTU/Xll19K+uW+QR6PR16vVy+99JKkssdbMN5rAACCzBBUkiwjIyPg7WfNmmWS7MEHH7Q9e/bY3r177cknn7Rhw4aZmdndd99tkuztt9+2nJwc27t3r/Xt29eio6Pt4MGDvnaysrJsypQptnfvXtuzZ491797dGjdu7Ft/rJ1x48bZnDlzbODAgfbtt98GXOdtt91m0dHRtnTpUtu3b59NmjTJwsLC7NNPPzUzs7/85S8myVavXm0HDx603bt3W+/evU2SrVy50nbt2mUHDx60sWPHmiRbs2aNr+0LL7zQ2rVrZ//973+tsLDQvv76a/vtb39rbrfbvvvuu4BruPvuu83lctlf//pX27dvn+Xn59u8efNMkn355Ze+dlasWGENGza0+++/v9x+p6ammtfr9Vs2btw4++qrr07Yti4dg0C3++mnn0ySzZkzx++xx8Zkbm6uZWdn27nnnmuxsbFWUFDg22769OkWHh5uL774ouXn59vnn39uzZs3t/PPP7/c4368jIwMY7pCqFV0fk9LS7PU1FS/ZZMnT7aoqChbtGiR5eTk2Nq1a+2ss86yJk2a2I4dO3zbLVmyxO/1WFBQYGlpafbiiy/6tRfInBDI6zVQVZkTRo8ebbGxsfbNN9/Y4cOHbd26ddatWzdr2LChbdmyxbfdsGHDrHnz5n77feSRR0yS7dq1q9Tje+DAAYuLi7MZM2bYoUOHbMeOHTZw4EDfYyoy919yySUmyfbt21fr+nlMSX+fSlPee4S0tDQLDw+3n3/+2e9xQ4cOtZdeesn3e6DjrbLvNQYNGmSDBg0KeHsAQJkyefccZBV5M1hQUGDx8fHWq1cvv+VFRUU2e/ZsM/vfH85Dhw751j/33HMmyb7++utS237ggQdMkmVnZ5faTqAOHTpkHo/H0tPTfcvy8/MtOjrabrzxRjP7X9Ddv3+/b5v/9//+n0nyC8affPKJSbLnn3/et+zCCy+0M844w2+fa9euNUl22223BVRDfn6+eTweu+iii/zaOf4Nc0WlpqaapBN+ygr7tf0YVORYlfXG/tdj6dg/Cr7//nvfsm7dutnZZ5/tt4/rr7/ewsLC7MiRIyccv7IQ9lEbVDXs5+fnW4MGDfxew2b/mxN+HUJ//XosLCy0K6+80l599VW/xwUyNwf6eg1UVeaE0aNHnxBOP/30U5Nk9913n29ZZUPw119/bZJsxYoVFe7X8coK+6Hu5zEVCfvHO/49wltvvWWSbNq0ab5tcnNz7aSTTrKioiIzq/x4qwjCPgAEVSan8YfQ2rVrlZOTo0suucRveXh4uMaNG1fq445d211YWFjuNsXFxVWuc8OGDcrPz9dpp53mWxYTE6PExMQyTz2NioqSJBUVFZ1QV1m1S1Lnzp3l9Xq1du3agGr4/vvvlZ+frwsvvLDiHSzHr+/Gb2ZlPjfHq43HoDqO1bF+/rpPhw8fPuHO3cXFxYqMjFR4eHjQ9g3UFevWrdOBAwfUtWtXv+XdunVTVFSU32nexxQXF2vo0KFq1qyZ3+n7UtXn5vLmoKoIdB9du3aVx+Mp9zKGQLRr107NmjXT8OHDNWXKFP34449VbrM8oehnMBz/HuGCCy7QySefrGeeecY3bz///PNKT0/3zdeVHW8AgNAh7IfQsesI4+Pjq9zWypUrdf7556tp06aKjo4u8Zrwyjp48KAk6Z577vH7nvnNmzcrPz8/aPs5XmRkpO8NVHk1bN26VZLUtGnTaq
vnmNmzZ/u92alO1XEMaupY9e3bV59//rlefPFFHTp0SJ999pmWL1+uP/7xj4R91Es5OTmSpAYNGpywLj4+Xvv37z9h+c0336yNGzfqiSee0DfffOO3LlRzc7BFR0dr165dVW4nJiZG//rXv9SzZ09Nnz5d7dq1U3p6ug4dOhSEKqsuWP2sjPLeI7hcLo0ZM0Y//PCD3n77bUnSc889p2uvvda3jVPGGwDUJ4T9EGrZsqUk+W7EU1lbtmzRgAEDlJiYqNWrVys3N1czZswIRomS/hcKZ82a5fcJt5lp1apVQdvPrxUVFWnv3r1KTk4OqAa32y1JOnLkSLXUEwrVdQxq6lhNmTJFF1xwgUaMGKG4uDgNHDhQgwcP1t/+9rdq3S9QWx37x25JoT4nJ0dJSUknLB88eLDefPNNxcfH6+qrr/Y7SygUc3OwFRYWltr3yjj11FP18ssva9u2bZo4caIyMjL06KOPBqXtqgh2P8vz/vvva9asWZICf48wYsQIud1uPf3009qwYYPi4uLUpk0b33onjDcAqG8I+yGUkpKiRo0a6Y033qhSO1999ZUKCwt14403ql27dnK73UH9GrVjd5Ffs2ZN0NoszzvvvKOjR4/qrLPOCqiG0047TWFhYXrvvfdqrMbt27dr5MiR1dZ+dR2DmjpW69at06ZNm7Rr1y4VFhZqy5Ytmj9/vhISEqp1v0Btddppp6lBgwb67LPP/JavXr1aBQUF6tKlywmP6dWrl5o0aaKnnnpKn3/+uaZNm+ZbF4q5OdjeffddmZm6d+/uWxYREVGpSwy2bdvmO/uhadOmevDBB3XWWWedcEZEKASzn4H4/PPPFRsbKynw9wgJCQkaMmSIli9frkcffVTXXXed33onjDcAqG8I+yEUHR2tSZMm6f3339fYsWP1888/6+jRo9q/f3+F3pwc++T3rbfe0uHDh7Vx48YSr/2sLLfbrZEjR2rJkiWaP3++8vLyVFxcrK1bt2r79u1B2UdBQYFyc3NVVFSkL774QmPHjlWbNm00YsSIgGpo2rSp0tLStHTpUi1YsEB5eXlau3Ztid/n/uqrr1bpq/fMTIcOHdKyZcsUFxdXlW77qaljUJFjVRU333yzkpOTdeDAgaC2C9QVjRo10rZt2/Tjjz9q//79Cg8P14QJE/TCCy9o8eLFysvL01dffaUbbrhBLVq00OjRo0ttq1+/fhoxYoSmT5+uzz//XFLNzM3BdvToUe3bt09FRUVau3atxo8fr+TkZN88J0nt27fX3r17tXz5chUWFmrXrl3avHnzCW0df3w3b96sMWPGaP369SooKNCXX36pzZs3+wJ2Vef+2tLPsv5BUFhYqJ07d+rdd9/1hf2KvEe44YYbdOTIEa1YsUKXXXaZ37q6ON4AoN6rmRsB1h+q4N2azczmzp1rnTt3NrfbbW63284880ybN2+ezZgxw2JiYkySnXTSSbZp0yZbvHixJSQkmCRLSkry3ZF/4sSJ1qhRI4uPj7crrrjC5s6da5IsNTXVbr75Zl87rVu3tkWLFlW4X0eOHLGJEydacnKyRUREWNOmTS0tLc3WrVtns2fPNo/HY5IsJSXFPvjgA3vooYfM6/WaJGvevLn9/e9/t+eff96aN29ukiwhIcGWLFliZmYLFy60Xr16WbNmzSwiIsIaN25sV155pW3evDngGszM9u/fb6NGjbLGjRtbgwYNrGfPnjZ58mTfsfrPf/5jZmavvPKKNWzY0O+uw8d74YUXSr0T/69/7rnnHjOzOncMAtluzpw5lpiYaJLM4/FYv379bN68eb5+HhuTTz31lMXFxZkka9Omje+rAv/1r39Z48aN/Y5XZGSkdezY0ZYtW1ah8cfd+FEbVHR+/+KLL6xNmzYWExNjPXv2tB07dtjRo0ftkUcesZNOOskiIyMtISHBBgwYYBs2bPA9btmyZb55PiUlxbKzsy0vL89at25tkqxBgwb23HPPmVnZc0JFXq+BqOqcMH
r0aIuMjLRWrVpZRESExcXFWf/+/W3Tpk1++9mzZ4/16tXL3G63tW3b1m655Ra7/fbbTZK1b9/e9/V1xx/f1atXW48ePSwhIcHCw8OtZcuWdvfdd/vuJh/I3P/xxx/bqaeeamFhYSbJEhMTbfr06bWqn48//nhAf59eeOEF377Keo/w668DNDM788wz7a677irx+JQ13n79nqWy7zW4Gz8ABFWmy+y422WjSlwulzIyMjR48OBQlwKE1Pz587Vx40bfdaPSL2cv3HnnnZo/f7727dunmJiYgNrKzMzUkCFDTri7P1CTmN+rZsyYMcrKytKePXtCXUq1quv9vPTSSzV37ly1bdu2xvd9xRVXSJKysrJqfN8A4EBZEaGuAIDz7NixQ2PHjj3h2s6oqCglJyersLBQhYWFAYd9AM4QjK+DrQvqUj8LCwt9X8W3du1aud3ukAR9AEDwcc1+PbV+/Xq/r84p7Sc9PT3UpaIOiomJUWRkpBYsWKCdO3eqsLBQ27Zt09NPP63JkycrPT09qPc7AFA5/C3AxIkTtXHjRn333XcaOXKkpk6dGuqSAABBwif79VSHDh04JRrVxuv16o033tD999+vk08+WQcPHlSDBg106qmn6qGHHtL1118f6hIBqOb+FkyaNEkLFy5UQUGB2rZtq0ceeUSDBg2q9v3WtLrYT4/How4dOqhVq1aaN2+eOnXqFOqSAABBwjX7QcY1nUDwcc0+agPmd6B6cc0+AARVFqfxAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA7jMjMLdRFO4nK5Ql0C4FhMVwgl5neg+g0aNEhZWVmhLgMAnCArItQVOE1GRkaoS0CAZs2aJUm69dZbQ1wJgLqA+b1uWbVqlWbPns3zVse0bt061CUAgGPwyT7qrcGDB0uSMjMzQ1wJACDYMjMzNWTIEM4IAgDUV1lcsw8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAAByGsA8AAAAAgMMQ9gEAAAAAcBjCPgAAAAAADkPYBwAAAADAYQj7AAAAAAA4DGEfAAAAAACHIewDAAAAAOAwhH0AAAAAABwmItQFADVh9+7dysvL81t28OBBSdIPP/zgtzwuLk5NmjSpsdoAAFVz6NAhbd++3W/Zzp07JZ04x4eHh6tNmzY1VhsAAKHiMjMLdRFAdVuwYIFGjRoV0LZPP/20rr322mquCAAQLHv27FFiYqKKiorK3bZ379569dVXa6AqAABCKovT+FEvDBw4UJGRkeVuFxkZqYEDB9ZARQCAYGncuLEuuugihYWV/bbG5XIpPT29hqoCACC0CPuoFxISEtS7d29FRJR+5UpERIT69OmjhISEGqwMABAMw4cPV3knK0ZERKh///41VBEAAKFF2Ee9MXz4cBUXF5e6vri4WMOHD6/BigAAwXL55ZcrOjq61PURERHq16+fvF5vDVYFAEDoEPZRb/Tr108xMTGlrne73br00ktrsCIAQLDExsbq8ssvL/WSreLiYg0bNqyGqwIAIHQI+6g33G63BgwYUOIbwcjISKWlpcnj8YSgMgBAMAwbNkyFhYUlrouJiVGfPn1quCIAAEKHsI96ZejQoSW+ESwsLNTQoUNDUBEAIFh69+6tuLi4E5ZHRkZqyJAhcr
vdIagKAIDQIOyjXrn44otLvAFffHy8/vCHP4SgIgBAsERGRmrw4MEnnMHFP3QBAPURYR/1SkREhNLT0xUVFeVbFhkZqaFDhwb01XwAgNqtpDO4GjdurF69eoWoIgAAQoOwj3rnyiuvVEFBge/3wsJCXXnllSGsCAAQLOedd56aNWvm+z0qKkrDhw9XeHh4CKsCAKDmEfZR7/Ts2VMtW7b0/Z6YmKhzzjknhBUBAIIlLCxMw4cP953BVVBQwD90AQD1EmEf9Y7L5fK9EYyMjNTVV18tl8sV6rIAAEHy6zO4kpKSdPbZZ4e4IgAAah5hH/XSsTeC3LQJAJyna9euatu2rSRpxIgR/EMXAFAvRRy/YNWqVZo5c2YoagFqVIMGDSRJ06ZNC3ElQPX785//rN/97nfV0jZ/N1AbxcTESJI++eQTXXHFFSGuBvCXlZUV6hIA1AMnfLL/008/aenSpaGoBahRbdq0UZs2bUJdBlDtli5dqp9++qna2ufvBmqj1q1by+v1Ki4uLtSlAD5bt25lvgRQY074ZP8Y/uMIp9u0aZMkKTU1NcSVANWrpk5h5u8GapvXX39dl1xySajLAHwyMzM1ZMiQUJcBoJ4oNewDTkfIBwBnI+gDAOozbtAHAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOAxhHwAAAAAAhyHsAwAAAADgMIR9AAAAAAAchrAPAAAAAIDDEPYBAAAAAHAYwj4AAAAAAA5D2AcAAAAAwGEI+wAAAAAAOEyVw363bt0UHh6u3/zmN+Vu+8orr8jr9erll18udZtRo0apYcOGcrlcWrNmTYUeW51Cvf9HH31UzZo1k8vl0hNPPFHhx8+YMUMdOnRQTEyMYmNj1aFDB917773Ky8sr9TFvvfWW7rrrrirvuyYF0s+XXnpJM2bMUHFxcaX2sWzZMrVr104ul8vvJyIiQk2aNNEf/vAHvfDCCyc8jvFfeYGMwWPj9fjnJzExUcOHDy93H//5z3+Unp6utm3bKjo6Wk2aNNEZZ5yhadOm+bZJT08/4Xkv7WfFihUn1HLvvfeWWcPMmTPlcrkUFhamDh066P3336/yeK0vQj1GK6IuzamoPqXN99Xp+DmpdevWWrBggW/9e++9p1atWvnmzqeeeqpG6gqk1kDncgDA/1Q57H/66afq1atXQNuaWbnbPP300/rb3/5WqcdWp1Dv/7bbbtNHH31U6cd/8MEHuu6667Rlyxbt3LlTU6dO1YwZMzRo0KASt//LX/6ixx57TJMmTaryvmtSIP3s16+f3G63LrzwQuXk5FR4H2lpafrhhx+Umpoqr9crM5OZadeuXcrIyNDPP/+stLQ0ZWRk+D2O8V955Y3BX4/X45+fHTt2aPHixWW2/9VXX6lHjx5KTEzUO++8o9zcXH300Ufq3bu33n33Xb9t33jjDeXk5KiwsFDbt2+X9MuYKigo0MGDB5Wdna3rrrtOkv9YkX55fgsLC0usobi4WI899pgk6YILLtD69ev1+9//vsrjtb4I9RitiLo0p6L6lDbfV6fj58effvpJ1157rW/973//e/Xt21fXX3+9tm/fruuvv75G6yur1kDmcgCAv6Cdxu9yucrd5tJLL1Vubq4uu+yyCrdflcdW1KFDh9SjR4+Q7b86REVF6aabblLTpk3VoEEDXXHFFerfv7/efPNNX2A55qGHHtLzzz+vzMxMNWzYsFL7K+kY1oRA+zlu3DidccYZ6tu3r4qKioKy74SEBF144YX6v//7P0lSZmam33rGf/UIxnh99NFHFR8fr9mzZyslJUVut1snn3yypk6dqpiYGN92LpdL55xzjrxeryIiIvyWR0ZGyuPxqGnTpurSpcsJ++jSpYt27Nih5cuXl1jDsmXL1KpVqxLXVcd4dZraPEbrolDN4Qido0eP6tprr1VkZKSeeOKJgN7XAQBqt6CF/cjIyGA1FfI/MAsWLFB2dnZIawi2F1
54QW6322/ZsWBx4MAB37Lvv/9e9957r+67774Ttq+IUB3DQPspSVOmTNGaNWs0e/bsoNaQkpIiSZX+FJbxH7hgjdc9e/YoNzdXe/fu9VseFRXld1r4kiVL5PF4ym1v9OjR+uMf/+i37MYbb5QkPf744yU+ZubMmZowYUKpbVbXeAVKUpfmgbos1PP9MUePHtWf/vQneTwezZ8/v9bUBQComqCF/e+//14dOnRQbGysYmJidO655+rDDz/0rf/www+VnJwsl8uluXPn+pabmR555BGdcsopio6Oltfr1e233+7XdkmPffjhh+XxeNSwYUNlZ2drwoQJatWqlTZs2KDi4mJNnjxZycnJiomJ0emnn37CKdWLFi1S165d5Xa7FRsbq5SUFE2dOlXjx4/XhAkTtGnTJrlcLrVv377M2mfOnKmOHTsqOjpaCQkJ6t+/v9avX+/bZv78+YqNjZXH49GLL76oPn36KC4uTklJSVqyZIlfTR988IE6deokr9crt9utzp076/XXX6/6k1OKjRs3Kj4+Xm3atPEte+yxx2Rm6tevX7mPf++993T22WfL4/EoLi5OnTt3Vl5eXonHcPbs2YqNjVVYWJi6dOmi5s2bKzIyUrGxsTrrrLN07rnnqnXr1nK73YqPj9cdd9xRrf2Ufvkk/rzzztPs2bN9pwC/9tpriouL0/Tp0yu9v7Vr10qSzjvvPN8yxn/1jP+KjNeydOvWTQcPHtQFF1ygf//731VqqzQXXHCBOnbsqHfeeUcbNmzwW/fvf/9b+fn5uvjii0t9fEnjta4aO3asoqKilJiY6Ft20003KTY2Vi6XS7t375YU+PgpaYx27NjRd/+DLl26KD8/X5J0xx13+MbYs88+K0llvmbKeq2VNgdK1Tufl7XfsvoS6PEsaR4IVtvHlDYHlbefiqipegOZC49tV958X17dZY3Hyv79Onr0qEaMGCGv1+s3xwerrvJeC2WN54oqa1+jRo3yXf+fmpqqL7/8UpI0cuRIeTweeb1evfTSS1XqKwDUOnacjIwMK2FxmS688EJr166d/fe//7XCwkL7+uuv7be//a253W777rvvfNv99NNPJsnmzJnjW3b33Xeby+Wyv/71r/8fe3ceF3W1/w/8NWwzrAMCgoqggAsquaRlpuWaLVe/KYqYXq/ezK1Ec8k185qmXC3cb5rLzRYFrOtaN5dyKzM1FUNFtASNFGQXEAZ4//7ox1xGthkYGBhez8eDP/jM+XzO+5zP+ZyZ98xnkbS0NMnJyZENGzYIALlw4UKl6wKQadOmybp162To0KFy9epVmTVrliiVStm9e7ekpaXJ/PnzxcLCQs6ePSsiIuHh4QJAli9fLikpKZKamiqbNm2SUaNGiYhIUFCQ+Pn56bSxrPoXLVokNjY28sknn0h6erpER0dLly5dxM3NTe7evVsqzqNHj0pGRoYkJSVJr169xN7eXvLz87XloqKiZPHixZKamiopKSnSvXt3cXV11b4eFxcnAORf//qXQfunpPz8fLlz546sW7dOlEqlfPLJJzqv+/r6Srt27Uqt92jdDx48ECcnJwkLC5Pc3Fy5e/euDB06VJKTk8vtw3feeUcAyJkzZyQ7O1vu378vzz//vACQgwcPSnJysmRnZ0toaKgAkIsXL9ZYO4vNmzdPZ6wdOHBAHB0dZcmSJZXW4efnJ2q1Wvt/Tk6OfP311+Lj4yPPPfecPHjwQKc8x7/xx39547Ws/VORnJwc6dq1qwAQANKuXTsJCwuTlJSUCtf7448/BID83//9X4Xl/Pz85LfffpM1a9YIAJk+fbrO60OGDJHt27dLVlaWAJB+/fqVuZ1Hx6u+AEhERIRB6xiiKu8bo0aNEg8PD51lK1euFADaeURE//Hz6BgtKCiQFi1aiLe3txQUFOjU8+abb0p4eLj2/8qOmbKOtXPnzlU4B9bUfF7Z3KtvWyrrz7LmMBKmsgAAIABJREFUAWNtu7I5qLJ69FVb8R
oyF+oz31dlPF69erVK718FBQUyatQosba2ltjYWKP056NxVXQsVDaeS8aqj8qOu6CgILG0tJTff/9dZ71XXnlF9u3bV+226qMq8yURURVFGi3Z79ixo86y6OhoASCzZs3SLnv0w1hOTo7Y2dnJgAEDdNbduXOnQclObm6udllubq7Y2dlJSEiIdllOTo4olUqZMmWK5Ofni7Ozs/Tp00enzoKCAlm9erWI6Jfs5OTkiIODg049IiI//fSTANB5sy0rzuI3+Bs3bpTqz2LvvfeeAJCkpCQRMU6y7+HhIQDE1dVV1qxZo/OB5sGDB6JQKGTQoEGl1nu07l9++UUAyIEDB8qsp6JkPysrS7vs448/FgBy+fJl7bLiPty1a1eNtLOkbdu2CQDZsWOHwXX4+flpk8OSf4GBgfLxxx9LXl6eTnmOf+OO/4rGq4hhHxBF/vyCaM2aNdK2bVvtvmzcuLEcO3as3HUMTfbT09PF3t5eXFxcJCcnR0REbt68KV5eXpKXl1dpsl/V8WoOyX5l46esY6Q4SYuMjNQuy87OFm9vb8nIyBCRyo+Z8mKobA58lLHm84rqrWpbyurPR+cBY227sjlIn3r0UVvx6jsX6jvfVzVuQ/n5+Ymjo6OMHDlSunTpIgCkffv2pb6kLmbMuEoeC/ocR4bO5eXVJSJy5MgRASBLly7VlsnIyJBWrVppvxSs6X3AZJ+IalGk0U7jf1RgYCDUarX2lOay3LhxAzk5OejXr5/R6o2NjUVOTg46dOigXWZrawtPT09cu3YN0dHRSE9Px8CBA3XWs7S0xLRp0/SuJyYmBg8ePEDXrl11lnfr1g02NjY4c+ZMhevb2NgAQLl35gb+dx8EYz5y6/bt20hKSsLnn3+Ojz/+GJ07d9Zel5mUlAQR0euaZF9fXzRu3BijR4/G4sWLcevWrSrFU9wPJW86VtzuivqmMhW1s6Titt67d69K9ZS8G79Go8GdO3fw5ptvIjQ0FI899pj2dOSycPxXb/wbMl71YW1tjdDQUFy9ehU//vgjXn75ZSQlJWH48OFIS0szSh1qtRqvvPIK0tLSsGvXLgBAeHg4pkyZou2TilR3vJoLfcYP8Odpu2q1Wuc+B59++ilefvllODk5Aaj8mCmPoXOgsebziuqtalv06U9jbbuyOaiq9ZgqXn3nQn3ne2O1Xx85OTl49tlncf78eQwZMgQxMTEYP358jcdV8lgw1mcJfeoC/rycqnXr1ti2bZv2cqhdu3YhJCQElpaWAGp3HxAR1bQaS/aBPyfZij483LlzBwDg7u5utDqzs7MBAAsXLtR55nV8fDxycnK014E5OztXq57im685ODiUes3Z2RlZWVkGb/PgwYPo3bs33N3doVQqjXrdejFra2u4u7vjueeew65duxATE4P33nsPAPDw4UMAgFKprHQ7tra2+Pbbb9GzZ08sW7YMvr6+CAkJQW5urtFjroqK2llS8Z3Wi9teHVZWVmjWrBnGjRuHVatWITY2FsuXLy+3PMe/LkPHvyHj1VBPPvkk/vOf/2Dy5MlITk7Gd999Z7RtF9+o78MPP0R6ejqioqIwadIkvdY15nhtCBwcHDBhwgT88MMP+OmnnwD8eYPE0NBQbZnKjpnyVDYH1tR8XlG9VW2LPoy17crmIGPVU1vx6jsX6jvf1+Q+fJSDgwMmTpwIANi+fTt8fX2xa9cuhIeHGzWuio4FY3+WqOy4UygUmDRpEn799VccPXoUALBjxw6dxw/W5j4gIqppNZbsFxQUIDU1Fd7e3uWWKb57dl5entHqLX4jDQ8P1/7iWvx3+vRpNG3aFAAq/MVVH8Vv/GUlNenp6fDy8jJoewkJCRgyZAg8PT1x5swZZGRkICwsrFoxVsbf3x+WlpaIiYkB8L9EQt9fntq3b4/9+/cjMTERc+bMQUREBFatWlVj8VbVo+0sKT8/HwB0Hq9mDIGBgQCAK1eulFuG4/9/qjL+DR2vJZ04cULnA21QUF
CZj7T761//CgBG/YDXqVMndO/eHT/99BMmTpyI4cOHw8XFRa91a2q8mrPQ0FBYW1sjPDwcJ06cQPPmzeHn56d9vbJjpiLlzYE1PZ+XV2912lIZY227sjnIWPXUVrz6zoX6zvc1uQ8rolarERUVpU2QT5w4YZS49DkWqvNZouRcru9xN3bsWKhUKmzZsgWxsbFwcnLSuYGvqfYBEVFNqLFk/7vvvkNRURG6dOlSbpkOHTrAwsICx48fN1q9xXd0v3jxYpmvt2jRAo0aNcKhQ4eqVU+HDh3g4OCAc+fO6Sw/c+YM8vPzy3zOdkUuX74MjUaDKVOmwNfXFyqVymiPvklJScErr7xSanlcXBwKCwvRvHlzAEDjxo2hUCiQkZFR6TYTExO1iay7uzuWL1+OLl26VJjc1jR921lScVs9PDyMGsv58+cBAG3atCm3DMf//1Rl/BsyXh91/vx52Nvba//Py8src+wW3135scceM7iOihT/ur979268+eabeq9XU+PVFKysrKp1qY6+vLy8EBwcjN27d+Ptt9/G9OnTdV6v7JgpT0VzYE3O5xXVW9W26MNY265sDjJWPbUVr75zob7zfU3uw8p06dIF4eHhKCgoQHBwMBITE6sdV2XHQnU/S5Scy/U97lxcXDBixAjs2bMHq1atwmuvvabzuin3ARGRsRkt2c/Pz0dGRgYKCgrw888/IzQ0FD4+Phg7dmy567i7uyMoKAi7d+/G1q1bkZmZiejoaGzevLnKcahUKowbNw47d+7Exo0bkZmZicLCQty5cwd//PEHlEol5s+fjxMnTiA0NBS///47ioqKkJWVpX1zadSoERITE3Hr1i1kZWWV+YFUpVJh5syZ+PLLL/Hpp58iMzMTly9fxuTJk9GkSRPtqXH6Kj4D4siRI3j48CHi4uIqve5ZX/b29jh06BC+/fZbZGZmQqPR4MKFC/jb3/4Ge3t7zJgxA8Cf1wP7+vpqTzesSGJiIiZNmoRr164hPz8fFy5cQHx8PLp37w5Avz40Nn3bWVJxW4t/if/6668NfnRRbm4uioqKICJITEzE9u3bsXDhQri5uVWYyHH8/09Vxr8h47WYRqPBvXv3cOzYMZ1kHwCGDBmCyMhIpKenIyMjA3v37sXcuXPxf//3f0ZP9oODg+Hm5oYhQ4bA19dX7/UeHa/1mb+/P1JTU7Fnzx5oNBokJycjPj6+RuqaOXMmCgoKkJaWhr59++q8VtkxU56K5sCanM8rqreqbSnLo/OApaWlUbZd2RxkrDYYazv6xKvPXKjvfF+duKvy/vWoyZMnY+TIkbh37x6GDx+unf+rGldlx0JlnyXKU9ZcbshxN3nyZOTl5eHAgQMYNGiQzmvGPI6IiEzu0Vv2VeUuodu3b5c+ffpI48aNxcrKSlxdXWXkyJESHx+vLbNu3Trx9PQUAGJnZyeDBw8WEZGsrCwZP368uLq6ioODg/Ts2VMWLVokAMTLy0suXbpU5rphYWFia2srAKR58+Y6j1bLy8uTOXPmiLe3t1hZWYm7u7sEBQVJTEyMtsz69eslMDBQVCqVqFQq6dy5s2zYsEFERH7++Wfx8fERW1tb6dmzpyxcuLDM2IuKimTlypXSqlUrsba2FhcXFxkyZIjO42s2bNggdnZ2AkBatWolN2/elM2bN4uTk5MAEB8fH+3jCefMmSONGjUSZ2dnGT58uKxfv14AiJ+fn0yfPl17h3l7e3sZOnSoQfto8ODB0rJlS3FwcBClUil+fn4SEhKicxd8EZHQ0FCxtrbW3ilcROT9998vVfetW7ekR48e4uLiIpaWltK0aVNZsGCB9m62j/bhvHnztP3QokULOXnypKxYsULUarUAEA8PD/nss89k165d2rpcXFxk586dNdLOYi+99JI0a9ZMioqKRETkq6++EkdHR5079T7qyy+/LPdO/EqlUlq1aiVTpkyRhIQE7Toc/zUz/ssarxXtn5J/X375pXadQ4cOyYgRI8TPz0
+USqXY2NhImzZtZPHixfLw4cNSYyAzM1OeeeYZadSokQAQCwsL8ff3l2XLlpU7Vtzc3OSNN97QvvbWW2/JDz/8oP2/ZD9bWFhIu3bt5OTJkzrbe3S86gt18G78KSkp0qdPH1GpVNKyZUuZOnWqzJ49WwCIv7+/JCQk6D1+yju+SurTp49s2bKlzFgqOmbKO9YqmwNraj6vrN6K2mLI8fjoPHD37l2jbVuk4jlInzlMH7UVrz5zoYh+831Vx6NI1d6/vLy8ZP78+aXibNOmjfaJJFu3bq1WXBUdCydPnix3PFdlLq+orpLvySIinTt3lnnz5hk8dipqqz54N34iqkWRCpH/fzvS/y8yMhIjRozAI4upgbhx4wYCAgKwfft2jB492tTh1KiUlBR4eXlh6dKlmDlzpqnDoSrgeNWPQqFAREQEgoODayQ2vm8QUX3z0ksvYf369WjZsmWt1sv5kohqUVSN3o2f6h9/f38sWbIES5YswYMHD0wdTo1avHgxOnXqpHNnbqpfOF6JiEgfJS9Ji46OhkqlqvVEn4iotjHZr6euXbum80iY8v5CQkIM3va8efMwfPhwhISEVOnmZ8ZUU+384IMPcPHiRXz11Vfa5/BS/VSXxmtN4Xg1bzU5n5sT9hNVx5w5cxAXF4fr169j3LhxePfdd00dEhFRjbMydQBUNW3btq3RU8CWLVuGQ4cOYfny5VixYkWN1VOZmmjn3r17kZeXh2PHjsHS0tKo2ybTqCvjtSZwvJq/mp7PzQX7iarDzs4Obdu2RbNmzbBhwwa0a9fO1CEREdU4XrNPRGTmeM0+EVHdwPmSiGoRr9knIiIiIiIiMjdM9omIiIiIiIjMDJN9IiIiIiIiIjPDZJ+IiIiIiIjIzDDZJyIiIiIiIjIzTPaJiIiIiIiIzAyTfSIiIiIiIiIzw2SfiIiIiIiIyMww2SciIiIiIiIyM0z2iYiIiIiIiMwMk30iIiIiIiIiM8Nkn4iIiIiIiMjMMNknIiIiIiIiMjNW5b0wfPjw2oyDiAyQkZEBhUIBJycnU4dCpMX3DfOTmZkJEYFarTZ1KERm4c6dO6YOgYgakFLJfvPmzTFs2DBTxEJEerpx4wZu3bqF5s2bo127dnBwcDB1SFSHDRs2DM2bN6+x7fN9w/xkZWXh6tWruH37Nlq0aIHHH3/c1CERmQUvLy/Ol0RUaxQiIqYOgogMIyLYvXs33n77bdy8eRMjR47E4sWL4evra+rQiKgeS0hIwLJly7Bt2zb4+/tj3rx5GDVqFCwtLU0dGhERERkmitfsE9VDCoUCw4cPx5UrV/D555/j9OnTCAgIwMSJE5GYmGjq8Iionrlz5w6mTZuG1q1b45tvvsGGDRtw+fJljBkzhok+ERFRPcVkn6ges7Cw0Cb969atw8GDB9GqVStMmzYN9+7dM3V4RFTH3b9/H3PnzkXr1q3xn//8B2FhYYiNjcWECRNgZVXubX2IiIioHuBp/ERmJD8/H//+97/xzjvv4MGDB3j99dcxd+5cODs7mzo0IqpDUlNTsXbtWoSHh0OpVGLmzJmYNm0aVCqVqUMjIiIi44hisk9khrKzs7F+/XqEhYVBoVBg6tSpmDFjBu/eT9TAPXjwABs2bMCKFStgaWmJ2bNnY+rUqbCzszN1aERERGRcTPaJzFlWVhY2btyI9957DzY2Npg1axZ/vSNqgEp+AVhQUIApU6Zg3rx5fKQeERGR+WKyT9QQ3L9/H6tWrcLatWvh5uaGmTNnYtKkSVAqlaYOjYhqEC/tISIiarCY7BM1JElJSfjggw+wevVqeHp6Yv78+Xj11Vd5t20iM1Oc5P/jH/9Aeno6xo8fj/nz58PDw8PUoREREVHt4KP3iBqSxo0bY8WKFbh+/ToGDhyI119/HYGBgYiKigK/9yOq/zQaDXbs2IGAgABMnToVf/nLX3Djxg2sWbOGiT4REVEDw2SfqAHy9vbGpk2bcPnyZXTt2hUjR45Ex44dER
UVZerQiKgKioqKEBUVhfbt22P8+PF4+umnce3aNWzatAlNmjQxdXhERERkAkz2iRqwtm3bYseOHbh06RLatm2LESNG4KmnnsLRo0dNHRoR6UFEsH//fnTp0gUhISHo1KkTrly5gh07dqBly5amDo+IiIhMiMk+EaF9+/aIjIzE6dOn4ebmhv79+6Nnz544ceKEqUMjonIcOXIEXbt2xcsvv4zWrVvjypUriIyMhL+/v6lDIyIiojqAyT4RaT355JPYv38/Tp06BWtrazz77LMYMGAAzp8/b+rQiOj/O3LkCJ544gkMGDAAjRo1wrlz5xAZGYk2bdqYOjQiIiKqQ5jsE1EpTz/9NL777jscPnwY6enp6NatGwYNGoTo6GhTh0bUYH3//ffo27cvBgwYALVajbNnz+Lw4cPo3LmzqUMjIiKiOojJPhGVq3///jh79iwOHTqEO3fuoHPnzggODsaNGzdMHRpRg/Hjjz9i0KBB6NmzJ/Lz83Hs2DEcPnwYXbt2NXVoREREVIcx2SeiSvXv3x/nz5/Hrl27cOnSJbRr1w5jxozBb7/9ZurQiMzW5cuXERwcjKeeegopKSk4cuQITp06hWeffdbUoREREVE9wGSfiPRiYWGB4cOH4+rVq/jss8/w/fffo23btpg4cSLu3r1r6vCIzMaVK1cQHByMjh07Ij4+Hvv27cMPP/yAfv36mTo0IiIiqkeY7BORQUom/evWrcOBAwfg7++PuXPnIi0tzdThEdVbv/32GyZOnIjHHnsMV69eRUREhPYUfiIiIiJDMdknoiqxsbHBhAkTEBcXh2XLluHf//43fHx8MHfuXGRkZJg6PKJ6Iz4+HhMnTkTr1q1x8uRJbNu2DZcuXcLw4cOhUChMHR4RERHVUwoREVMHQUT134MHD7BhwwasWLEClpaWmD17NkJDQ2Fra2vq0IjqpDt37mDlypXYtGkTmjRpgnnz5uHVV1+FpaWlqUMjIiKi+i+KyT4RGVVqairWrl2L8PBwKJVKzJw5E9OmTYNKpTJ1aER1QnJyMt5//32sXbsWbm5uWLhwIf7+97/DysrK1KERERGR+WCyT0Q14/79+1i1ahXWrFmDxo0bY8GCBUxoqEFLSUnBunXr8MEHH8DW1hYzZszgF2FERERUU5jsE1HNun37NlatWoVNmzahadOmmDt3Lk9VpgYlKysLGzduxPLly2FtbY1Zs2bxEhciIiKqaUz2iah2xMfH47333sPWrVvRtm1bvPPOOxg2bBhvQEZmKzs7G+vXr0dYWBgUCgWmTp2KGTNmwMnJydShERERkfmL4t34iahW+Pj4YNOmTbh8+TLatWuHESNGoHv37ti/f7+pQyMyqpycHKxZswb+/v5YunQpJkyYgJs3b2Lx4sVM9ImIiKjWMNknoloVEBCAyMhIXLp0CT4+Phg8eDCefvppfPfdd6YOjaha8vPzsXnzZrRq1QoLFixAcHAwbt68iRUrVsDZ2dnU4REREVEDw2SfiEwiMDAQkZGROH36NGxtbdG3b18MGDAAZ8+eNXVoRAbRaDTYsWMHAgICMHXqVPzlL39BXFyc9uaURERERKbAZJ+ITKp79+44cuQITp48ifz8fDzxxBMYMGAALly4YOrQiCpUVFSEqKgotG/fHuPHj0f//v3x66+/YtOmTWjSpImpwyMiIqIGjsk+EdUJPXv2xPHjx3H48GGkpaWha9euCA4OxvXr100dGpEOEUFUVBTatWuHkJAQdOrUCVevXsWmTZvQrFkzU4dHREREBIDJPhHVMf3798fZs2exZ88exMbGIiAgQHvtM5GpHTlyBI8//jhCQkLw2GOP4erVq4iMjISfn5+pQyMiIiLSwWSfiOochUKBQYMG4cKFC9i1axcuXLiAgIAATJw4EYmJiaYOjxqgI0eO4IknnsBzzz2HZs2a4fz584iMjETr1q1NHRoRERFRmZjsE1GdZWFhgeHDh+PKlSvYsmULDh8+DF9fX0ycOB
H37t0zdXjUAJw6dQp9+vTBgAEDoFarcfbsWezfvx+dOnUydWhEREREFWKyT0R1nrW1NcaMGYNr165h7dq12LdvH/z9/TF37lykp6ebOjwyQ6dPn0b//v3Rq1cvaDQa7f0kHn/8cVOHRkRERKQXJvtEVG/Y2NhgwoQJuHHjBhYuXIjNmzfDz88PixcvRmZmpqnDIzMQHR2N4OBg9OjRA7m5uTh69ChOnTqFZ555xtShERERERmEyT4R1Tv29vaYM2cO4uPj8dZbb2H16tXw8/NDWFgYcnNzTR0e1UMxMTEIDg5Gp06dkJCQgH379uH7779H3759TR0aERERUZUw2SeiesvR0RFz5szBzZs38eqrr+If//gHWrdujTVr1iAvL8/U4VE9cO3aNYwZMwYdO3bE1atXERERgdOnT2PQoEGmDo2IiIioWpjsE1G95+rqihUrViA+Ph6jRo3C3Llz0aZNG2zevBmFhYV6bePhw4c1HCXVtPz8fL3LxsfHY+LEiQgMDMS5c+ewbds2XLp0CcOHD4dCoajBKImIiIhqB5N9IjIb7u7uWLFiBWJjYzFw4EC8/vrrCAwMxI4dO1BUVFTueiKC/v3749ixY7UXLBlVVlYWevfujZiYmArL3b59G9OmTUObNm1w6NAhbNiwAZcvX8aYMWNgYcG3RCIiIjIf/GRDRGbH29sbmzZtwvXr19GrVy/8/e9/R8eOHREVFQURKVX+yy+/xPfff48XX3wRP/30kwkipurIzc3Fiy++iNOnT+Ptt98us0xycjLmzp2L1q1bY8+ePVi7di3i4uIwYcIEWFpa1nLERERERDVPIWV98iUiMiMxMTH4xz/+gd27d+PJJ5/E/PnztddkFxYWIiAgADdv3oRCoYC9vT1OnTqFwMBAE0dN+sjPz8egQYPw7bffoqCgAAqFAj///DM6deoEAEhJScHKlSuxbt06ODg4YMaMGZg2bRpUKpWJIyciIiKqUVH8ZZ+IzF779u0RGRmJH3/8EW5ubhg8eDB69uyJ48eP4/PPP8eNGzdQVFSEwsJC5OTkoHfv3rh27Zqpw6ZKFBYWYvTo0Th69CgKCgoAAFZWVliwYAGysrIQFhYGPz8/bN26FYsWLcKtW7cwZ84cJvpERETUIPCXfSJqcE6cOIGFCxfi5MmTcHZ2RmZmps41/dbW1nB1dcXp06fRokUL0wVK5RIRvPbaa/j3v/9d5k0YHRwcoFQqMXv2bLzxxhuwt7c3QZREREREJhPFZJ+IGqxp06Zh3bp1ZV7Hb21tjaZNm+LHH3+Ep6enCaKjisyYMQNr1qwp88aL1tbWaNmyJc6ePQsnJycTREdERERkcjyNn4gapry8PERGRpb7ukajQWJiIvr06YPU1NRajIwqM2/ePKxevbrcJyxoNBpcv34dFy5cqOXIiIiIiOoOJvtE1CB9+OGHSE5OLvNX/WIajQY3b97EgAEDkJWVVYvRUXmWLVuGFStWVLjfgD+v3Z87d24tRUVERERU9/A0fiJqcLKzs+Hj44OUlBS9yltZWeHpp5/Gf//7X97czYTWrl2LadOmGbTO4cOH0b9//xqKiIiIiKjO4mn8RNTwfP3111AoFNr/FQoFbGxsYGVlVWb5goICfP/99xg2bBg0Gk1thUklbNu2DdOnTy/39bL2oZWVFT7++OPaCI+IiIiozuEv+0QlVHQNN5mf3Nxc3Lt3D3fv3sXdu3fxxx9/IDExEX/88Yf2tH2FQgErKysUFBRARPDUU09h2rRpOl8WUM364YcfsHbtWogIrKysUFRUpL1e38rKCq6urmjWrBmaNm0KDw8PeHp6wsPDA25ubrC0tDRx9FRbgoODTR0CERFRXcK78ROVxASOiKh+4scZIiIiHVFln7NK1IBFRETwFyKqUE5ODqytrWFtbW3qUMxeZmYmH59HFYqMjMSIESNMHQYREVGdw2SfiMhAdnZ2pg6hwWCiT0RERFQ1vEEfERERERERkZlhsk9ERE
RERERkZpjsExEREREREZkZJvtEREREREREZobJPhEREREREZGZYbJPREREREREZGaY7BMRERERERGZGSb7RERERERERGaGyT4RERERERGRmWGyT0RERERERGRmmOwTERERERERmRkm+0RERERERERmhsk+ERERERERkZlhsk9kAsuXL4darYZCocDFixdNHY7exo0bB5VKBYVCgYcPH5pNHN26dYOlpSU6depU5W189dVXUKvV2L9/f7llxo8fD0dHxzq3343R/vLo2+byyunTr7UtNjYWU6dORfv27eHo6AgrKyuo1Wq0bt0aL730Ek6fPm3qEImIiIiY7BOZwrx587Bp0yZTh2Gw7du3Y9asWaYOw+hxnD17Fn369KnWNkSk0jJbtmzBRx99VK16aoIx2l8efdtcXjl9+rU2bd26FYGBgYiOjsYHH3yA27dvIzs7GxcuXMC7776L9PR0XL582dRhEhEREcHK1AEQ1We5ubno168ffvjhB1OHQkagUCiqvO5LL72EjIwMI0ZT+6rT/ppSl/r1xx9/xMSJE/Hss8/im2++gZXV/95CfX194evrC2dnZ8TFxZkwyoqZcs7ifElERFS7mOwTVcPWrVuRlJRk6jBMoq4khsaMw9ra2mjbKk9d6bey1FT79W1zbfSNiGD37t1IS0vDhAkTDFp36dKlKCwsxPLly3US/ZIGDhyIgQMHGiPUGmHKOashz5dERESmwNP4iapo+vTpmDlzJm7evAmFQgF/f38AfyYTH3zwAQICAqBUKuHi4oKXX34Z165dq3B79+7dQ4sWLWBlZYXnn39eu7ywsBCLFi2Ct7c3bG1t8dhjjyEiIgIAsHHjRtjb28POzg579+7FCy+8ACcnJ3h5eWHnzp1Vbtsnn3yCrl27QqXF6Wi6AAAgAElEQVRSwd7eHi1atMC7776rfd3CwgIHDx7ECy+8ALVajSZNmmDbtm062zh58iTatWsHtVoNlUqFwMBAfPPNNwCAf/7zn7Czs4OjoyOSkpIwc+ZMNGvWDLGxsQbFWVkc48ePh0KhgEKhgJ+fHy5cuADgz2v+7ezsoFarsW/fPm35GzduoG3btrC3t4etrS169eqFU6dOaV8vL+6tW7fC29sbCoUC69ev15YXEaxcuRJt2rSBUqmEWq3G7NmzDWpjSRWNhdWrV8Pe3h4WFhZ4/PHH4eHhAWtra9jb26NLly7o1asXmjdvDpVKBWdnZ7z11lultl9Z+yuLwZA261Pu1KlTpfrVkDFfWFiI9957D23atIGtrS3c3NzQsmVLvPfeewgODtaW++9//wsnJycsW7as3L7Pz8/H0aNH4erqiieeeKLccmW1s7L5wNDjuKLjs6Ljrrw5y1hzjLHrJiIiomoSItICIBEREXqXDwoKEj8/P51lixYtEhsbG/nkk08kPT1doqOjpUuXLuLm5iZ3797Vltu5c6cAkAsXLoiISH5+vgQFBcnevXt1tjdr1ixRKpWye/duSUtLk/nz54uFhYWcPXtWREQWLFggAOTo0aOSkZEhSUlJ0qtXL7G3t5f8/HyD+yA8PFwAyPLlyyUlJUVSU1Nl06ZNMmrUqFL1paenS2pqqrz44ouiVColOztbu52oqChZvHixpKamSkpKinTv3l1cXV21rxdvZ9q0abJu3ToZOnSoXL16Ve849Y0jKChILC0t5ffff9dZ/5VXXpF9+/Zp/+/Xr5/4+vrKb7/9JhqNRn755Rd58sknRaVSyfXr1yuN+/bt2wJA1q1bp1NWoVDI+++/L2lpaZKTkyMbNmzQ2e+GqGwsvPPOOwJAzpw5I9nZ2XL//n15/vnnBYAcPHhQkpOTJTs7W0JDQwWAXLx40eD26zMe9WmzvuXK61d9xvyyZcvE0tJS9u7dKzk5OXL+/Hnx8PCQ3r176/TrgQMHxNHRUZYsWVJu31+/fl0ASPfu3Q3aZ/rOB/q2qbLjs7Ljrqw5y1hzTE3UrY+IiAjhxxkiIqJSIvnuSF
RCdZP9nJwccXBwkJCQEJ1yP/30kwDQSSZKJvsajUZGjhwpX3/9tc56ubm5Ymdnp7O9nJwcUSqVMmXKFBH53wfx3NxcbZnipOnGjRt6t0Xkzy8cnJ2dpU+fPjrLCwoKZPXq1eXWt2PHDgEgv/zyS7nbfu+99wSAJCUllbsdQ+gbx5EjRwSALF26VLssIyNDWrVqJQUFBdpl/fr1k44dO+rUER0dLQBk1qxZFdYrUjopzcnJETs7OxkwYIBOuUe/5NGXPmOhONnPysrSlvn4448FgFy+fFm7rHg87tq1y6D2VxaDvm02pG8qSvYrG/PdunWTJ554QqeOCRMmiIWFheTl5Ykhzp07JwCkf//+eq9jyHygT5v0OT4f9ehx9+icVZNzjDHq1geTfSIiojJF8jR+IiOKiYnBgwcP0LVrV53l3bp1g42NDc6cOVNqncLCQrzyyito3Lixzun7wJ+P+MrJyUGHDh20y2xtbeHp6VnhZQE2NjYAAI1GY1D80dHRSE9PL3XNsaWlJaZNm1buesXXeldUX3GZwsJCg2IyRFlx9O3bF61bt8a2bdu0d3bftWsXQkJCYGlpWeH2AgMDoVarER0dbXAsN27cQE5ODvr162fwumWp7lgoKCjQLtNnfwGl219ZDPq22dh9A5Q95h8+fFjqbv6FhYWwtraudN8/ysHBAQCQk5Oj9zpVmQ9KerRNVTk+KzvuanKOqam6iYiISD9M9omMKD09HcD/EoOSnJ2dkZWVVWr5G2+8gbi4OHz44Ye4cuWKzmvZ2dkAgIULF2qvPVcoFIiPjzco6dBXZmamNtbqOnjwIHr37g13d3colcoyrxGvDQqFApMmTcKvv/6Ko0ePAgB27NiBV199Va/1ra2tDf7SBADu3LkDAHB3dzd43bLU9lgoVrL9lcWgb5uN3TflefHFF3H+/Hns3bsXubm5OHfuHPbs2YO//OUvBif7LVq0gEqlwvXr1/VepyrzQUX0OT4NPe6MOa5MWTcRERGVxmSfyIiKP4SX9SE+PT0dXl5epZYHBwfj8OHDcHZ2xpgxY3R+gS1OhsLDwyEiOn+nT582evxNmzYFANy/f79a20lISMCQIUPg6emJM2fOICMjA2FhYcYIsUrGjh0LlUqFLVu2IDY2Fk5OTvDx8al0vYKCAqSmpsLb29vgOlUqFQAgLy/P4HXLUttjASjd/spi0LfNxu6b8ixevBh9+/bF2LFj4eTkhKFDhyI4OBgfffSRwdtSKpUYOHAg7t+/j++//77ccqmpqRg/fjyAqs0HFans+KzKcWescWXKuomIiKhsTPaJjKhDhw5wcHDAuXPndJafOXMG+fn5ePzxx0ut06dPH7i5uWHz5s04f/48li5dqn2t+O7pFy9erPHYgT9/vWzUqBEOHTpUre1cvnwZGo0GU6ZMga+vL1QqlUkfOefi4oIRI0Zgz549WLVqFV577TW91vvuu+9QVFSELl26GFxnhw4dYGFhgePHjxu8bllqeywApdtfWQz6ttnYfVOemJgY3Lx5E8nJydBoNEhISMDGjRvh4uJSpe0tXrwYSqUSM2bMQG5ubpllfvnlF+1j+aoyH1SksuOzKsedscaVKesmIiKisjHZJ6qGRo0aITExEbdu3UJWVhYsLS0xc+ZMfPnll/j000+RmZmJy5cvY/LkyWjSpAkmTpxY7rYGDx6MsWPHYtmyZTh//jyAP38BHTduHHbu3ImNGzciMzMThYWFuHPnDv744w+jt0epVGL+/Pk4ceIEQkND8fvvv6OoqAhZWVmlLjGoSPEvwUeOHMHDhw8RFxdX6fXJNW3y5MnIy8vDgQMHMGjQoDLL5OfnIyMjAwUFBfj5558RGhoKHx8fjB071uD63N3dERQUhN27d2Pr1q3IzMxEdHQ0Nm/eXKX4a2MsVNb+ymLQt83G7pvyvPHGG/D29saDBw8qLPf1119X+ug9AOjUqRM+++wz/PLLL+jVqxe++uorZGRkQKPR4Lfffs
NHH32EV199VXutukqlqvJ8UJbKjk99jruy5ixjjCtT1k1ERETlqM3bARLVdTDwbvw///yz+Pj4iK2trfTs2VPu3r0rRUVFsnLlSmnVqpVYW1uLi4uLDBkyRGJjY7XrffHFF+Li4iIApEWLFpKUlCSZmZnSvHlzASAODg6yY8cOERHJy8uTOXPmiLe3t1hZWYm7u7sEBQVJTEyMbNiwQezs7ASAtGrVSm7evCmbN28WJycnASA+Pj46j03T1/r16yUwMFBUKpWoVCrp3LmzbNiwQcLCwsTW1lanvk8//VTbFi8vL+2d8OfMmSONGjUSZ2dnGT58uKxfv14AiJ+fn7zxxhva7TRv3lw++eQTg+IzJI6SOnfuLPPmzStzm9u3b5c+ffpI48aNxcrKSlxdXWXkyJESHx9fZr0l4163bp14enoKALGzs5PBgweLiEhWVpaMHz9eXF1dxcHBQXr27CmLFi3Sxnjp0iWD2l3RWFi9erV2LLRo0UJOnjwpK1asELVaLQDEw8NDPvvsM9m1a5d4eHgIAHFxcZGdO3fq3f7KYjCkzfqUK6tfDRnz3377rbi6ugoA7Z+1tbUEBATIF198oW3TV199JY6OjjpPbKhIQkKCzJo1SwIDA8XBwUEsLS3F2dlZOnfuLK+++qp8//332rL6zAeGHsflHZ8iFR93CQkJZc5ZxppjjF23vng3fiIiojJFKkQeuVUxUQOmUCgQERGB4OBgU4dCNeCll17C+vXr0bJlS1OHQrVg48aNiIuLQ3h4uHZZfn4+5s6di40bNyItLQ22trYmjJCMITIyEiNGjCj15AUiIqIGLsrK1BEQEdUUjUajPaU6OjoaKpWKiX4DcffuXYSGhpa6HtzGxgbe3t7QaDTQaDRM9omIiMhs8Zp9IjN37do1ncdalfcXEhJidnHOmTMHcXFxuH79OsaNG4d33323BltguPqyb+ojW1tbWFtbY+vWrbh37x40Gg0SExOxZcsWLFq0CCEhIXBycjJ1mEREREQ1hr/sE5m5tm3b1ovTW2siTjs7O7Rt2xbNmjXDhg0b0K5dO6Nuv7rqy76pj9RqNQ4dOoQlS5agdevWyM7OhoODA9q3b48VK1ZgwoQJpg6RiIiIqEbxmn2iEnjNPhFR/cJr9omIiMoUxdP4iYiIiIiIiMwMk30iIiIiIiIiM8Nkn4iIiIiIiMjMMNknIiIiIiIiMjNM9omIiIiIiIjMDJN9IiIiIiIiIjPDZJ+IiIiIiIjIzDDZJyIiIiIiIjIzTPaJiIiIiIiIzAyTfSIiIiIiIiIzw2SfiIiIiIiIyMww2SciIiIiIiIyM0z2iYiIiIiIiMyMlakDIKprTp8+beoQiIhIT5yziYiIyqYQETF1EER1hUKhMHUIRERUBfw4Q0REpCOKv+wTlcAPi2SugoODAQCRkZEmjoSIiIiIagOv2SciIiIiIiIyM0z2iYiIiIiIiMwMk30iIiIiIiIiM8Nkn4iIiIiIiMjMMNknIiIiIiIiMjNM9omIiIiIiIjMDJN9IiIiIiIiIjPDZJ+IiIiIiIjIzDDZJyIiIiIiIjIzTPaJiIiIiIiIzAyTfSIiIiIiIiIzw2SfiIiIiIiIyMww2SciIiIiIiIyM0z2iYiIiIiIiMwMk30iIiIiIiIiM8Nkn4iIiIiIiMjMMNknIiIiIiIiMjNM9omIiIiIiIjMDJN9IiIiIiIiIjPDZJ+IiIiIiIjIzDDZJyIiIiIiIjIzTPaJiIiIiIiIzAyTfSIiIiIiIiIzw2SfiIiIiIiIyMww2SciIiIiIiIyM0z2iYiIiIiIiMwMk30iIiIiIiIiM8Nkn4iIiIiIiMjMMNknIiIiIiIiMjNM9omIiIiIiIjMDJN9IiIiIiIiIjPDZJ+IiIiIiIjIzDDZJyIiIiIiIjIzVqYOgIiIjOv48eP48ccfdZZdu3YNABAWFqazvHv37nj22WdrLTYiIiIiqh0KERFTB0FERMZz+PBhPPfcc7C2to
aFRdkncBUVFUGj0eDQoUMYMGBALUdIRERERDUsisk+EZGZKSwshIeHB1JSUios5+LigqSkJFhZ8SQvIiIiIjMTxWv2iYjMjKWlJUaNGgUbG5tyy9jY2OCvf/0rE30iIiIiM8Vkn4jIDI0cORL5+fnlvp6fn4+RI0fWYkREREREVJt4Gj8RkZny8fFBQkJCma95eXkhISEBCoWilqMiIiIiolrA0/iJiMzV6NGjYW1tXWq5jY0N/va3vzHRJyIiIjJjTPaJiMzU6NGjodFoSi3Pz89HSEiICSIiIiIiotrCZJ+IyEwFBAQgICCg1PK2bduiQ4cOJoiIiIiIiGoLk30iIjM2ZswYnVP5ra2t8be//c2EERERERFRbeAN+oiIzFhCQgJatGiB4qleoVDg119/RYsWLUwbGBERERHVJN6gj4jInHl7e6Nr166wsLCAQqFAt27dmOgTERERNQBM9omIzNyYMWNgYWEBS0tL/PWvfzV1OERERERUC3gaPxGRmUtOTkaTJk0AAL///js8PDxMHBERERER1bAoK1NHQERUFcOHD8fu3btNHUa94+npaeoQ6o1hw4YhKiqqRrbN8UtEFeH8Q0SGKus3fCb7RFRvde/eHW+++aapw6gXjh8/DoVCgWeeecbUodQL4eHhNV4Hxy8RlYXzDxEZ4vTp01i9enWZrzHZJ6J6y8vLC8HBwaYOo154/vnnAQBOTk4mjqR+qKlf1Eri+CWisnD+ISJDMdknImrAmOQTERERNSy8Gz8RERERERGRmWGyT0RERERERGRmmOwTERERERERmRkm+0RERERERERmhsk+ERERERERkZlhsk9ERERERERkZpjsExEREREREZkZJvtEREREREREZobJPhEREREREZGZYbJPREREREREZGaY7BMRERERERGZGSb7RERERERERGaGyT4RERERERGRmWGyT0QN1vjx4+Ho6AiFQoGLFy+aOpxqKSoqQnh4OHr06FFumc8//xzdunWDo6MjfHx8MG7cONy9e9fgur744gv4+vpCoVDo/NnY2KBx48bo3bs3Vq5cibS0tOo0iUpYtWoVGjduDIVCgQ8//FC7/KuvvoJarcb+/ftNGF3d0K1bN1haWqJTp06mDsUs9suPP/6IgIAAWFhYQKFQwMPDA0uXLjV1WDoenYs8PT0xevRoU4dF1VTefFeddev6MVnX49MH54y6ick+ETVYW7ZswUcffWTqMKotLi4OzzzzDGbMmIGcnJwyy0RERGDUqFEYPnw47ty5g7179+LEiRN44YUXUFBQYFB9QUFB+PXXX+Hn5we1Wg0RQVFREZKSkhAZGYmWLVtizpw5aN++Pc6dO2eMJjZ4s2bNwg8//FBquYiYIJq66ezZs+jTp4+pwwBgHvule/fuuHr1Kp577jkAQGxsLBYuXGjiqHQ9OhfdvXsXn376qanDomoqb76rzrp1/Zis6/Hpg3NG3cRkn4ioHrt06RLmzp2LyZMnV/iL5qZNm9C0aVPMnj0barUanTp1wowZM3Dx4kWcOXOm2nEoFAo4Ozujd+/e2L59OyIjI3Hv3j289NJLyMjIqPb2qWzF/Tto0CBTh1JnKBQKU4dQp/ZLbm5uhWf81Cfm1Jb6pjp9Xxf2W106JstSl+KrC/vLWMypLVXFZJ+IGrS6kBhUR8eOHfHFF19g1KhRUCqV5Za7ffs2mjRpotPe5s2bAwDi4+ONHtewYcMwduxYJCUlGXwaJlF1WFtbmzqEOmXr1q1ISkoydRhGYU5tqW+q0/fcb/WLOe0vc2pLVTHZJ6IGQ0SwcuVKtGnTBkqlEmq1GrNnzy5VrrCwEIsWLYK3tzdsbW3x2GOPISIiAgCwceNG2Nvbw87ODnv37sULL7wAJycneHl5YefOnTrbOX78OJ544gnY2dnByckJgYGByMzMrLSOmuDr61vqDa/4en1fX1/tsv/+979wcnLCsmXLql3n2LFjAQBff/21dpk59q2pnDp1Ct
7e3lAoFFi/fj0Aw/rQGP2kb32hoaGwsbGBp6endtnrr78Oe3t7KBQK3L9/HwCwevVq2Nvbw8LCAo8//jg8PDxgbW0Ne3t7dOnSBb169ULz5s2hUqng7OyMt956q1RMN27cQNu2bWFvbw9bW1v06tULp06d0rvt//znP2FnZwdHR0ckJSVh5syZaNasGWJjY/Xqk+rsl7Vr10KlUqFx48aYNGkSmjRpApVKhR49euicgaNvf06fPh0zZ87EzZs3oVAo4O/vD6B6x3lda4uhTp48iXbt2kGtVkOlUiEwMBDffPMNgD/v41J8La+fnx8uXLgAABg3bhzs7OygVquxb98+ADU7huqa8vpeRPDBBx8gICAASqUSLi4uePnll3Ht2rVK161oPxhbTc+VFb0f1XR8nDMMb4uh6v2cIURE9dCwYcNk2LBhBq2zYMECUSgU8v7770taWprk5OTIhg0bBIBcuHBBW27WrFmiVCpl9+7dkpaWJvPnzxcLCws5e/asdjsA5OjRo5KRkSFJSUnSq1cvsbe3l/z8fBERefDggTg5OUlYWJjk5ubK3bt3ZejQoZKcnKxXHVXx5JNPSseOHct87dixY2JtbS1r166VzMxM+eWXXyQgIEAGDhyoU+7AgQPi6OgoS5YsqbQ+Pz8/UavV5b6emZkpAKR58+baZfWlb6syvgxRle3HxcUJAPnXv/6lXXb79m0BIOvWrdMu06cPRYw3BvWtb9SoUeLh4aGz7sqVKwWAdt+JiLzzzjsCQM6cOSPZ2dly//59ef755wWAHDx4UJKTkyU7O1tCQ0MFgFy8eFG7br9+/cTX11d+++030Wg08ssvv8iTTz4pKpVKrl+/rnfbi9s0bdo0WbdunQwdOlSuXr2qd59UZ79MnDhR7O3t5cqVK/Lw4UOJiYmRbt26iaOjoyQkJBjcn0FBQeLn56dTzpDjfODAgQJA0tLS6mRbRCqfi0qKioqSxYsXS2pqqqSkpEj37t3F1dVVpw5LS0v5/fffddZ75ZVXZN++fdr/a3IM1cX5p6y+X7RokdjY2Mgnn3wi6enpEh0dLV26dBE3Nze5e/duhetWth/Kmu/0VZtzZWXvR/rinFF7bRExvzkjIiJCyknrI/nLPhE1CLm5uQgPD0f//v0xY8YMODs7w9bWFo0aNdIp9/DhQ2zcuBFDhgxBUFAQnJ2dsXDhQlhbW2P79u06ZXv06AEnJye4u7sjJCQE2dnZSEhIAADcunULmZmZaN++PVQqFTw8PPDFF1/Azc3NoDqM5dlnn8WcOXMQGhoKJycndOjQAVlZWdiyZYtOuZdeegmZmZl4++23q11n8ZMOsrKyAJhv39ZVFfVhTfRTRfVVRbt27WBnZwdXV1eMHDkSAODt7Q03NzfY2dlp76Bc8ldE4M9x16JFC1hZWaF9+/b46KOP8PDhQ2zevNngtq9YsQJvvPEGvvjiC7Rt27bKbSlJn36ysrLS/lrarl07bNy4EVlZWUYbw8Y6zutCWww1bNgwvPPOO3BxcUGjRo0wePBgpKSkIDk5GQAwefJkFBYW6sSXmZmJs2fP4sUXXwRg+jFUF+Tm5uKDDz7A0KFDMXr0aKjVagQGBuLDDz/E/fv3tcdbeSrbD7WpOnNlRe9HtRFfMc4ZNae+zxlM9omoQbhx4wZycnLQr1+/CsvFxsYiJycHHTp00C6ztbWFp6dnqaSiJBsbGwCARqMB8Oep8Y0bN8bo0aOxePFi3Lp1q9p1VMeCBQuwefNmHD16FA8ePMCvv/6KHj164KmnnsLt27drpM7s7GyICJycnACYb9/WB4/2YU3306P1GWt7JZ8cUXxtfmV1BAYGQq1WIzo6GkDdGiP69lPXrl1hZ2dXp8dwfW1L8TgqLCwEAPTt2xetW7fGtm3btHdI37VrF0JCQmBpaQmgbo0hU4mJicGDBw/QtWtXneXdunWDjY2NwTd+fXQ/mIqhc2VF70e1EV956tpxVpb62pb6Nmcw2SeiBu
HOnTsAAHd39wrLZWdnAwAWLlyo8wz5+Pj4ch9rVxZbW1t8++236NmzJ5YtWwZfX1+EhIQgNzfXaHXo648//kBYWBgmTJiAvn37wt7eHi1btsRHH32ExMRErFy50uh1AsD169cBQPvNtDn2bX3V0PrJ2tpa+4GyvrZdqVSa5FfPmmDKthw8eBC9e/eGu7s7lEplqfs+KBQKTJo0Cb/++iuOHj0KANixYwdeffVVbZn6OoaMKT09HQDg4OBQ6jVnZ2ftGV3lqWw/1BWV7euK3o9MjXOGcdT3OYPJPhE1CCqVCgCQl5dXYbniLwPCw8MhIjp/p0+fNqjO9u3bY//+/UhMTMScOXMQERGBVatWGbUOfcTFxaGwsBBNmzbVWe7k5IRGjRohJibG6HUCf97QBwBeeOEFAObZt/VVQ+qngoICpKamwtvbG0D9bLtGo0F6ejq8vLxMHUq11XZbTpw4gfDwcABAQkIChgwZAk9PT5w5cwYZGRkICwsrtc7YsWOhUqmwZcsWxMbGwsnJCT4+PtrX6+MYMjZnZ2cAKDOpr2z/6rsf6gJ99nV570emxDmj6sxtzrCq0a0TEdURHTp0gIWFBY4fP47JkyeXW674Tt8XL16sVn2JiYlIT09Hu3bt4O7ujuXLl+PQoUO4cuWK0erQV/Eb5B9//KGzPCsrC6mpqdpH8BnT3bt3ER4eDi8vL/z9738HYJ59W1+Zop+srKyMdlq/Ib777jsUFRWhS5cuAEzT9uo6duwYRATdu3fXLjNVf1ZXbbfl/PnzsLe3BwBcvnwZGo0GU6ZM0T6FpKzHr7q4uGDEiBHYtWsXHB0d8dprr+m8Xh/HkLF16NABDg4OOHfunM7yM2fOID8/H48//ni56+q7H+qCyvZ1Re9HpsQ5o+rMbc7gL/tE1CC4u7sjKCgIu3fvxtatW5GZmYno6OhSNxFSqVQYN24cdu7ciY0bNyIzMxOFhYW4c+dOqWS5IomJiZg0aRKuXbuG/Px8XLhwAfHx8ejevbvR6tBXy5Yt0adPH3z00Uc4ceIEcnNzcfv2bUycOBEAdE41+/rrrw16vI6I4MGDBygqKoKIIDk5GREREXj66adhaWmJPXv2aK/ZN8e+ra9M0U/+/v5ITU3Fnj17oNFokJycjPj4eKPXk5+fj4yMDBQUFODnn39GaGgofHx8tI+CrA9jpKioCGlpaSgoKEB0dDSmT58Ob29vbRsA/fuzUaNGSExMxK1bt5CVlQWNRmPwcV6X21IejUaDe/fu4dixY9oP7sVndxw5cgQPHz5EXFxcudeWT548GXl5eThw4AAGDRqk81p9GEPG9mjfW1paYubMmfjyyy/x6aefIjMzE5cvX8bkyZPRpEkT7ftLWes2adIEgH77wdQq29cVvR/VJs4ZnDPKpdf9/ImI6piqPDooKytLxo8fL66uruLg4CA9e/aURYsWCQDx8vKSS5cuiYhIXl6ezJkzR7y9vcXKykrc3d0lKChIYmJiZMOGDWJnZycApFWrVnLz5k3ZvHmzODk5CQDx8fGR69evy61bt6RHjx7i4uIilpaW0rRpU1mwYIEUFBRUWochTp8+LU8//bQ0adJEAAgA8fT0lB49esjx48e15e7fvy/Tp08Xf39/USqV4uDgIE8//bT85z//0dneV199JY6OjrJ06dJy69y3b5889thjYmdnJzY2NmJhYSEARKFQiLOzszzxxBOyZMkSSUlJKbVufenbuvboq/fff188PDwEgNjb28vQoUNl3bp14unpKQDEzs5OBg8erHcfihinnwypLyUlRfr06SMqlUpatmwpU6dOldmzZwsA8ff3l4SEBFm9erV2e/a0Pg0AACAASURBVC1atJCTJ0/KihUrRK1WCwDx8PCQzz77THbt2qXtDxcXF9m5c6eIiGzfvv3/sXff8VVU+f/H3ze9kRB6TYAgBEgQkRJYUCzYWF2kNxGVFWyAggZB+UYEpLgCCogU2QX9IQm6Yt9F3UVR4C
sIUhOaEDALoQQCJJB2fn/4zV1jCklIMrmT1/Px4A8m586878ycm/vJnDljbrnlFlOnTh3j4eFhatasaQYPHmyOHj2aJ3dR733WrFnG19fX+djIVatWFXt/GGOu+biMGjXKeHp6moYNGxoPDw8TGBhoevfubQ4dOpRnO8XZn8YY8+OPP5rQ0FDj6+trunXrZk6cOFGsfr5582bTpk0bZ/+uV6+emT59eqV6L2+++aYJCwtzfvYV9u+DDz5wbis6OtrUqFHDVK9e3fTv398sWLDASDJhYWF5Hu1ljDE33HCDef755wvcP+V5DlW2zx9jCj6PcnJyzJw5c8x1111nPD09TXBwsLn//vtNQkLCVV9b1HEYN25cvs+74qroz8qr/T4qDj4z+My41s+Moh695zDm/6YNBAAX0r9/f0lSXFycxUlgR+V9fnH+ojCjR49WXFyczpw5Y3WUa+bq76VXr15asGCBmjZtWqHb5fMHJeHq/ey3XP29WPWZERsbq4EDB6qAsj6OYfwAAACViNWPHytLrvRefjvEd+fOnfLx8anwL+1AabhSP7saV3ovrvCZQbEPAJVIfHx8nseyFPZv0KBBVkeFTXEO5sc+qRqio6N14MAB7d+/Xw899JBefvllqyNBrtn/XDEzSs4VPjOYjR8AKpHw8PCChmEBFYZzML+K2ieTJk3SihUrlJGRoaZNm2rOnDnq169fuW+3PLjie/Hz81N4eLgaNmyohQsXqnXr1lZHglzzM4nPjJJzxffiCp8Z3LMPwCVxzyHKE/fMArAKnz8ASoJ79gEAAAAAqEIo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAAAAAACbodgHAAAAAMBmKPYBAAAAALAZin0AAAAAAGyGYh8AAAAAAJuh2AcAAAAAwGY8rA4AAKW1du1aORwOq2PApvr161eu6+f8BVAYPn8AlAWHMcZYHQIASmrTpk06duyY1TFc2r59+xQTE6MFCxaodu3aVsepdBo3bqwuXbqUy7o5f/Fbc+fOlSQ9/fTTFidBZcHnD35r48aNWrhwoWbPnq3GjRtbHQeV1IABA36/KI5iHwCqqPT0dAUFBelvf/ubBg8ebHUcoMrK/YIWGxtrcRIAlU1GRoZat26tm266SW+//bbVceBa4rhnHwCqKF9fX0VGRmrLli1WRwEAAAVYvHixfvnlF8XExFgdBS6IYh8AqrAuXbpo8+bNVscAAAC/c/HiRc2YMUNPPPGEQkJCrI4DF0SxDwBVWOfOnbV9+3ZdvnzZ6igAAOA3/vKXv+jy5ct6/vnnrY4CF0WxDwBVWFRUlDIyMrR9+3arowAAgP9z+vRpvfbaa3r22WdVs2ZNq+PARVHsA0AV1rx5c9WqVYuh/AAAVCLTpk2Tj4+PxowZY3UUuDAPqwMAAKzjcDjUuXNnJukDAKCSOHr0qBYvXqy5c+eqWrVqVseBC+PKPgBUcZ07d+bKPgAAlcSLL76oBg0a6JFHHrE6ClwcxT4AVHFRUVE6evSofvnlF6ujAABQpe3atUvvvvuuZsyYIS8vL6vjwMVR7ANAFde5c2e5ubkxlB8AAItNmjRJkZGRGjBggNVRYAMU+wBQxQUGBqpVq1YU+wAAWGjjxo365JNPNHv2bLm5Uabh2nEWAQDUpUsX7tsHAMBCEydO1E033aQ77rjD6iiwCWbjBwCoc+fOevfdd5WZmSlPT0+r4wAAUKV89NFH+u677/T9999bHQU2wpV9AICioqKUnp6uXbt2WR0FAIAqJScnR1OmTFHfvn3VpUsXq+PARij2AQBq3bq1goKCGMoPAEAFW7VqlXbv3q2XX37Z6iiwGYp9AIDc3NzUsWNHJukDAKACZWRkaOrUqXr44YfVqlUrq+PAZij2AQCSfh3Kz5V9AAAqzsKFC5WUlKQXX3zR6iiwIYp9AICkXyfpO3DggE6fPm11FAAAbO/ixY
uaOXOmxowZo8aNG1sdBzZEsQ8AkCTnpEAM5QcAoPzNnj1bGRkZio6OtjoKbIpiHwAgSapZs6bCwsIo9gEAKGenTp3SvHnzFB0drRo1algdBzZFsQ8AcOrSpQv37QMAUM5eeuklBQQEaMyYMVZHgY1R7AMAnDp37qwtW7YoJyfH6igAANjSkSNHtHTpUsXExMjPz8/qOLAxin0AgFNUVJRSU1O1b98+q6MAAGBLkydPVpMmTfTQQw9ZHQU2R7EPAHBq27at/Pz8GMoPAEA52Llzp9577z1NmzZNnp6eVseBzVHsAwCcPD09deONNzJJHwAA5SA6Olrt27dXv379rI6CKsDD6gAAgMolKipKX3zxhdUxAACwlW+//VZffPGF1q9fL4fDYXUcVAFc2QcA5NG5c2ft2bNH58+ftzoKAAC2MXHiRPXs2VO333671VFQRVDsAwDy6NKli3JycvTDDz9YHQUAAFv4+9//rk2bNunll1+2OgqqEIp9AEAeDRo0UOPGjZmkDwCAMpCdna0XXnhB/fv3V+fOna2OgyqEe/YBAPl06dKFSfoAACgDf/3rX7V//369//77VkdBFcOVfQBAPp07d9amTZtkjLE6CgAALuvy5cuaOnWqRo4cqfDwcKvjoIqh2AcA5BMVFaUzZ87o0KFDVkcBAMBlLViwQKdOndLkyZOtjoIqiGIfAJDPjTfeKG9vb+7bBwCgGLKzs/MtO3/+vGbOnKlx48apUaNGFqRCVcc9+wCAfLy9vXX99ddry5Ytuummm7R582Zt3rxZ3333nYYMGaKxY8daHRFwSadPn1ZqamqeZZcuXZIkHT58OM/ywMBA1apVq8KyASi9qKgoPfTQQ/rzn/8sT09PSdLs2bOVk5OjCRMmWJwOVZXDcEMmAOD/pKWladu2bdq8ebOWLVumY8eOKT09XQ6HQ15eXrpy5Yree+89DRw40OqogEtavny5Ro4cWay2y5Yt0yOPPFLOiQBcqwsXLigoKEjGGIWEhGjmzJm66aab1LJlS8XExFDswypxFPsAAO3du1dDhgzR7t27lZ2dLU9PT2VnZysnJydf202bNikqKsqClIDrS0lJUd26dZWZmVlkO09PT508eVLBwcEVlAxAaW3dulUdO3aUJLm5uckYo1q1aiknJ0fHjh2Tr6+vxQlRRcVxzz4AQK1atVK1atWcs+9nZmYWWOhLUmhoaEVGA2wlODhYd911lzw8Cr+T0sPDQ3fffTeFPuAi4uPj5eb2a1mVk5MjY4zOnj2rM2fOqEePHvr2228tToiqimIfACCHw6Hly5c7v6wUxtPTU/Xq1augVIA9DRs2rMDJvHJlZ2dr2LBhFZgIwLXYv3+/8z79XLl9/Mcff9TNN9+sP/3pT9q3b58V8VCFUewDACRJLVq00HPPPSd3d/dC29SvX18Oh6MCUwH2c9999xU5rNfHx0e9evWqwEQArsW+ffsKvTUnKytLxhh99NFHev755wsdNQeUB4p9AIDTCy+8oAYNGhRa8Ddr1qyCEwH24+Pjo/vvvz/flUDp19Ezffv2lZ+fnwXJAJTGzp07iyzi3dzcNHjwYMXFxV11BB1QljjbAABOvr6+Wrp0aYFDjD08PBQWFmZBKsB+hgwZUuCVwMzMTA0ZMsSCRABKIycnR0eOHCn05w6HQ4899pjeeeedAv/AB5Qnin0AQB533nmn+vTpk+9Libu7O5PzAWXkjjvuKHACvurVq+v222+3IBGA0jhy5IgyMjIK/flzzz2nBQsWcEUfluCsAwDks2DBAnl5eeVZlpWVRbEPlBEPDw8NGjQoTz/z9PTUkCFDuPoHuJD4+PgCl7u5uWnx4sWaOXNmBScC/otiHwCQT/369TVt2rQ8VyKys7PVpEkT60IBNjN48OA8VwQzMzM1ePBgCxMBKKn4+Pg8f7Rzc3OTh4eHVq9erVGjRlmYDKDYBwAU4qmnnlLr1q3zPA+cK/tA2enWrZsaNGjg/H+9evX0hz
/8wcJEAEoqISFBxhhJv97u5uXlpU8++UQDBgywOBlAsQ8AKIS7u7uWL1/unKzPzc0tT2EC4No4HA4NGzZMXl5e8vT01PDhw3m0JeBidu/erczMTHl4eMjf319ff/217rzzTqtjAZIo9gEARejUqZNzGGKdOnW4lxgoY7lD+ZmFH3BNuffs16pVS1u2bFGXLl0sTgT8l8fVmwCA63nttde0adMmq2PYQmZmpry8vJSRkaH+/ftbHadKeuaZZ1zqCyT9r2QCAgIkSdOmTbM4ievo0qWLnnnmGatjVHp8ZpevzMxMnT17VgEBAWrfvr1efPFFqyNVOFf7/VTVcGUfgC1t2rRJmzdvtjqGLXh6eqpdu3bOggQVa+3atTp27JjVMUqE/lcyoaGhzIdRAps3b+aPScW0du1aHT9+3OoYtnXhwgUFBwfrlltukZ+fn9VxKpwr/n6qariyD8C2oqKiFBcXZ3UM2/jqq6902223WR2jynHVe7jpf8V36NAhSVJYWJjFSVwDV6tL5umnn2ayuHISHx+vRo0aVdk/hrvq76eqhGIfAFAsFPpA+aDIB1xTeHi41RGAIjGMHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAAAAAACbodgHAAAAAMBmKPYBAAAAALAZin0AAAAAAGyGYh8AAAAAAJuh2AcAAAAAwGYo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYB4BCjBw5UtWqVZPD4dCOHTusjlOpXL58WeHh4XrhhRdK/Nr3339fzZo1k8PhyPPPy8tLderUUY8ePTRnzhylpKSUQ3K4Cjv1v5ycHM2dO1ddu3YttE1mZqZmzJih5s2by8vLS9WrV1dERISOHDlSom3Rv1CWqlI/7NGjR75+k/svICCgRNuiH6KyoNgHgEIsW7ZMS5cutTpGpTR58mQlJCSU6rV9+/bV4cOHFRYWpqCgIBljlJOTo+TkZMXGxqpp06aKjo5WmzZttHXr1jJODldhl/534MAB3XTTTXrmmWeUlpZWaLuBAwdq5cqVevfdd5WWlqZ9+/YpLCxMFy9eLNH26F8oS1WtHxamW7duJWpPP0RlQbEPAFVEenp6kVcWi+v777/X7t27yyDRfzkcDlWvXl09evTQihUrFBsbq5MnT6pXr146f/58mW7LCmW17+FafvrpJ02cOFGPPfaY2rVrV2i79957Tx9++KHi4uLUuXNneXh4qH79+lq3bp0iIiKuOYfd+xdQlOL2Qx8fH6WmpsoYk+ffqFGj9Nxzz11zDvohrECxDwBFcDgcVkcoM8uXL1dycvI1rSM9PV3PPvus5s2bV0apCtavXz+NGDFCycnJWrx4cbluqyKUxb6vily9/11//fV6//33NXToUHl7exfa7s0331T79u0VGRlZIbns1r9QvqpKP/ziiy9UrVq1PMuOHTum3bt369Zbby3zXPRDVASKfQD4P8YYzZkzRy1btpS3t7eCgoL07LPP5mkze/Zs+fn5qVq1akpOTtb48ePVsGFDJSQkyBij1157Ta1atZK3t7eCg4PVu3dvxcfHO1//+uuvy8fHR3Xq1NHo0aNVv359+fj4qGvXrtqyZUu+PFdb35gxY+Tl5aV69eo5lz3xxBPy9/eXw+HQ6dOnJUnjxo3T+PHjdejQITkcDjVv3rxU+2jy5Ml64oknVLt27QJ//sUXXygwMFDTp08v1fp/a8SIEZKkzz//XBL73u6K0/8kKTs7W1OmTFFISIh8fX3Vtm1brVmzRpK0aNEi+fv7y8/PT+vWrdPdd9+twMBANWrUSKtXr86zng0bNqhTp07y8/NTYGCgIiMjlZqaetVtlLWMjAxt3ry5yCuOucqzf0n227couaraDwszc+ZMjR07Ns8y+iFcigEAG+rXr5/p169fiV4zefJk43A4zF/+8heTkpJi0tLSzMKFC40ks3379jztJJmxY8eaN954w/Tp08fs27fPTJkyxXh5eZ
lVq1aZc+fOmZ07d5r27dubWrVqmRMnTjhfP2rUKOPv72/27t1rLl++bPbs2WM6duxoqlWrZhITE53tiru+oUOHmrp16+Z5L3PmzDGSzKlTp5zL+vbta8LCwkq0T35r48aN5r777jPGGHPq1CkjyUyePDlPm08++cRUq1bNTJ069arrCwsLM0FBQYX+PDU11UgyjRs3di6rivteklmzZk2pXmuV8ux/EyZMMN7e3mbt2rUmJSXFTJo0ybi5uZkffvjBuR5J5quvvjLnz583ycnJpnv37sbf399kZGQYY4y5ePGiCQwMNLNmzTLp6enmxIkTpk+fPs5jdrVtlEbnzp3N9ddfn2/5zz//bCSZdu3amR49eph69eoZb29vEx4ebhYsWGBycnKcbcu7f7nKvi3N+VVVlfTzo6r2w4IcP37ctG7d2mRnZ+dZTj/8L1f8/VTFxFLsA7Clkn4ZTEtLM35+fqZnz555lq9evbrQYj89PT3P6wMCAsygQYPyvP5///d/jaQ8XwpGjRqV75f/Dz/8YCSZl156qcTrq4iCMy0tzXTo0MEcP37cGFN4sV8SV/sSZIwxDofDVK9e3fn/qrjvXfHLVHn1v/T0dOPn55fn2KSlpRlvb2/z+OOPG2MKPkdyi5WDBw8aY4zZvXu3kWQ++eSTfFmKs43SKKzI2LVrl5Fkevbsab777jtz5swZc+7cOTNx4kQjybzzzjul2l5J+5cr7VuK/eIryedHVe6HBXnyySfNm2++WeptGWPvfmiMa/5+qmJiGcYPAJIOHjyotLQ03XbbbaV6/Z49e3Tx4kV16NAhz/KOHTvKy8sr3zDx3+vQoYP8/Pycw8SvdX1lbdKkSXr00UfVsGHDCtvmpUuXZIxRYGBgke3svu+rguL2v4SEBKWlpeWZtM7X11f16tXLc4vF73l5eUn69fF2ktSsWTPVqVNHw4YNU0xMTJ7H25V2G6WVew9xmzZt1LVrV9WoUUNBQUF66aWXFBQUpCVLlpT5NqX8/cuO+xYlU5X74e8lJSXpo48+cg6zLy/0Q5Q3in0AkHT8+HFJKvRe9Ks5d+6cJBX4LN7q1avrwoULV12Ht7e3Tp06VWbrKysbN27Url27NHLkyArbpiTt379fkhQeHl5kOzvv+6qiuP3v0qVLkqQXXnghz7Orjx49WqLHafn6+urrr79Wt27dNH36dDVr1kyDBg1Senp6mW2juOrXry9Jzjkecnl5eSk0NFSHDh0q821K+fuXHfctSqYq98PfmzVrlv785z/Lx8enXLdDP0R5o9gHAMn5C/3KlSulen316tUlqcBC8Ny5c2rUqFGRr8/MzMzT7lrXV5aWL1+ur776Sm5ubs4vBblfBqdPny6Hw1Euzwn+4osvJEl33313ke3svO+riuL2v9zzbu7cufkej7Vp06YSbbNNmzb6+OOPlZSUpOjoaK1Zs0avvvpqmW6jOAICAnTddddp7969+X6WlZWloKCgMt+mlL9/2XHfomSqcj/8rRMnTuj//b//p8cff7xctyPRD1H+KPYBQFJERITc3Ny0YcOGUr8+ICAgX9G7ZcsWZWRk6MYbbyzy9f/+979ljFFUVFSJ1+fh4eEculceVqxYke8LQe5V8MmTJ8sYk2/I+7U6ceKE5s6dq0aNGunhhx8usq2d931VUdz+17hxY/n4+GjHjh3XtL2kpCRncV27dm298sorat++vfbu3Vtm2yiJgQMHavv27Tp8+LBzWVpamo4ePVouj+MrqH/Zdd+i+Kp6P8w1a9YsDRs2TDVq1CjX7dAPUREo9gFAv/4y7Nu3r9auXavly5crNTVVO3fuLPb9sj4+Pho/frw++OADvfPOO0pNTdWuXbv02GOPqX79+ho1alSe9jk5OUpJSVFWVpZ27typcePGKSQkxHl/YEnW17x5c509e1YffvihMjMzderUKR09ejRfxho1aigpKUlHjhzRhQsXyqVI/fzzz0v0SCJjjC5evKicnBznHx
HWrFmjP/zhD3J3d9eHH3541Xv22feur7j9z8fHRw899JBWr16tRYsWKTU1VdnZ2Tp+/Lj+85//FHt7SUlJGj16tOLj45WRkaHt27fr6NGjioqKKrNtlMQzzzyj0NBQjRgxQomJiTpz5oyio6OVnp6uiRMnOtuVZ/+y675F8VX1fihJJ0+e1Ntvv62nn3660Db0Q7iUsp/0DwCsV5rZmi9cuGBGjhxpatasaQICAky3bt3MlClTjCTTqFEj89NPP5lZs2YZX19f56NyVq1a5Xx9Tk6OmTNnjrnuuuuMp6enCQ4ONvfff79JSEjIs51Ro0YZT09P07BhQ+Ph4WECAwNN7969zaFDh/K0K+76zpw5Y2655Rbj4+NjmjZtap566inz7LPPGkmmefPmzkfK/fjjjyY0NNT4+vqabt265XmEXEkVNhv/Z599ZqpVq2amTZtW6Gs/+ugj07ZtW+Pn52e8vLyMm5ubkeSckbhTp05m6tSp5syZM3leV1X3vVxwtuPy6n/GGHPlyhUTHR1tQkJCjIeHh6ldu7bp27ev2bNnj1m4cKHx8/Mzksx1111nDh06ZJYsWWICAwONJBMaGmr2799vjhw5Yrp27WqCg4ONu7u7adCggZk8ebLJysq66jZKYtOmTeYPf/iDqV+/vpFkJJl69eqZrl27mg0bNuRpe+zYMTN48GATHBxsvL29TadOncznn3+ep0159i9X2rfMxl98Jf38qOr98JlnnjHDhg0rcn30w/9yxd9PVUyswxhjKuoPCwBQUfr37y9JiouLszhJfqNHj1ZcXJzOnDljdZQqxxX3vcPh0Jo1azRgwACroxRbZe5/cH2cX8Xnip8fcB2cX5VeHMP4AcAC2dnZVkeostj3AACgKqDYB4AqKD4+Ps8jdwr7N2jQIKujApUG/QawHv0QKD4PqwMAQFUyadIkrVixQhkZGWratKnmzJmjfv36VXiO8PBwVbW7uCrLvofrqor9Bqhs6IdA8VHsA0AFmjFjhmbMmGF1jCqJfQ8AAKoShvEDAAAAAGAzFPsAAAAAANgMxT4AAAAAADZDsQ8AAAAAgM1Q7AMAAAAAYDMU+wAAAAAA2AzFPgAAAAAANkOxDwAAAACAzVDsAwAAAABgMxT7AAAAAADYDMU+AAAAAAA2Q7EPAAAAAIDNUOwDAAAAAGAzHlYHAIDysnnzZvXv39/qGHAhZ8+elbu7u4KCgqyO4vLofygvmzdvVlRUlNUxXMbcuXMVFxd3zes5f/68MjMzVatWrTJIBaAiUOwDsKUuXbpYHQEu6ODBg0pMTFS9evXUsmVL1a5d2+pI6tevnxo3bmx1jBKh/5XM1q1bJUkdOnSwOIlriIqK4hwrpn79+l3T640xSkpK0sGDB3Xq1Ck1bNiQYh9Orvj7qapxGGOM1SEAAKgsNm7cqFmzZumTTz7RDTfcoHHjxmno0KFyd3e3OhpsasCAAZKk2NhYi5MAvzp//rz++te/at68eUpMTNStt96qMWPG6I9//KMcDofV8QAUTxz37AMA8BvdunXTxx9/rG3btikiIkIPP/ywWrRoofnz5ys9Pd3qeABQbhISEjR27Fg1bNhQL774ou644w7t3r1b69ev17333kuhD7gYin0AAArQvn17rVy5UgkJCfrjH/+o559/Xk2aNFFMTIzOnj1rdTwAKBM5OTn68ssvde+996pVq1b67LPP9OKLLyoxMVFvvfWWWrVqZXVEAKVEsQ8AQBHCwsI0f/58HTlyRI899phef/11hYaGauzYsUpMTLQ6HgCUSmpqqubPn6/mzZvrzjvv1OXLl7Vu3Trt379f0dHRql69utURAVwjin0AAIqhTp06iomJUWJioqZNm6a///3vat68uYYPH67du3dbHQ8AimX//v0aO3asGjRooBdffFE9e/bUrl27GKoP2BDFPgAAJRAQEKCxY8fq4MGDWrZsmX788UdFRkY67/UHgMrmt0P1w8PDnUP1jx49qrfeek
utW7e2OiKAckCxDwBAKXh5eWn48OHatWuXPvroI/n6+uq+++7TjTfeqJUrVyo7O9vqiACquNTUVC1ZskRt2rTRHXfcocuXL2vNmjWKj49XdHS0goODrY4IoBxR7AMAcA0cDofuvfderV+/Xlu3blWbNm308MMPq2XLlszgD8ASBw4ccA7VnzBhgm666SbnrPr9+/fnUaJAFUGxDwBAGcm9qp+QkKBevXpp4sSJzOAPoEL8dqh+y5Yt9emnnzJUH6jiKPYBAChjRc3gf+zYMavjAbCR3w7V79mzp1JSUrRmzRolJCQwVB+o4ij2AQAoJ3Xr1lVMTIyOHj2qadOm6YMPPlBYWJiGDx+uPXv2WB0PgAs7cOCAJk6cqNDQUI0fP945VH/jxo0M1QcgiWIfAIByV61aNY0dO1aHDh3SsmXLtG3bNkVGRqpnz57M4A+g2H4/VH/t2rWaOHGic6h+mzZtrI4IoBKh2AcAoILkzuC/e/durVu3Tunp6czgD+CqLly4oCVLligiIqLAofo1atSwOiKASohiHwCACpY7g//GjRuZwR9AoQ4ePKiJEycqJCRE48ePV/fu3bVr1y6G6gMoFop9AAAslHtVPz4+Pt8M/ikpKVbHA1DBfjtUv0WLFoqLi8szVD8iIsLqiABcBMU+AACVQPPmzfPM4D9//nznDP7Hjx+3Oh6AcpY7VD93Po/cofr79+9nqD6AUqHYBwCgEsmdwT8xMVEvv/xynhn89+7da3U8AGUsd6h+7h/3brzxRobqAygTFPsAS3i1cwAAIABJREFUAFRCv53Bf+nSpdq6dasiIiJ077336rvvvrM6HoBrYIzRl19+qQEDBig8PFyxsbGKjo7WL7/8opUrVzJUH0CZoNgHAKAS+/0M/mfPnlW3bt3UoUMHZvAHXMzvh+onJSVp9erVDNUHUC4o9gEAcAFubm7Oq/rffvut6tevrxEjRig8PFzz58/X5cuXrY4IoBCHDh3KM1S/ffv22rlzp3OovoeHh9URAdgQxT4AAC6mW7du+vjjj5WQkKB77rmHGfyBSmrjxo0aMGCAWrZsqVWrVmnMmDE6fvy4Vq5cqcjISKvjAbA5in0AAFzUddddp/nz5+vnn3/W6NGjmcEfqAQuXryoJUuWKCIiQt27d3cO1T969KhiYmJUs2ZNqyMCqCIo9gEAcHH16tVTTEyMjh49qpdfflnvv/8+M/gDFezw4cPOofpjxoxR+/bt9dNPPzFUH4BlKPYBALCJwMBAjR07VocPH3bO4B8ZGal7771X33//vdXxAFvKHarfokULrVq1Sk899ZRzVv22bdtaHQ9AFUaxDwCAzfx2Bv8PP/xQZ86c0R/+8Ad169ZNcXFxzOAPXKPLly8777vv3r27Dh8+rLfffpuh+gAqFYp9AABsKncG/++//17ffvutgoODNXDgQGbwB0opd6h+w4YN9eijj+qGG27Qjh07tHXrVg0fPpyh+gAqFYp9AACqgNwZ/Hfu3Klbb71V0dHRzhn8z507Z3U8oFL77az6K1eu1FNPPeWcVf/666+3Oh4AFIhiHwCAKiQiIkJvvfWWjhw54pzBPyQkRGPHjtUvv/xidTyg0sgdqt+2bVvnUP3ly5crMTFRMTExqlWrltURAaBIFPsAAFRBv5/Bf+3atWrWrJmGDx+uffv2WR0PsMzPP/+siRMnqlGjRnr00UfVrl07huoDcEkU+wAAVGG5M/j//PPPWrp0qX744QdFREQwgz+qnN/Oqr9y5Uo9+eSTDNUH4NIo9gEAgHMG/z179ujDDz/U6dOnnTP4f/zxxzLGWB0RKHNXrlwpcKh+7qz6DNUH4Moo9gEAgFPuDP6bNm1yzuD/pz/9SS1btmQGf9hGUlKSYmJinLPqh4eHa9OmTc6h+p6enlZHBIBrRrEPAAAKlHtV/6efflJUVJSeffZZNW3alBn84bJyh+qHhobqrbfe0pNPPqljx44pNjZWUVFRVscDgDJFsQ8AAI
oUGRmplStX6ujRoxo1apTmzZun0NBQZvCHS8gdqn/99dcXOKt+7dq1rY4IAOXCYbgJDwAAlEBqaqpWrFih2bNn6/Tp0xo4cKAmTZqk8PBwq6NVen/96181b948ZWdnO5edOnVKkvIUne7u7ho3bpxGjBhR0RFtIykpSUuWLNHChQuVmpqqP/3pT3r66afVpUsXq6MBQEWIo9gHAAClcuXKFa1Zs0YzZszQgQMHdM8992jSpEkUU0VISEgo9h9F9u3bxx9QSmHbtm2aP3++Vq9erVq1aunBBx/UU089pYYNG1odDQAqUhzD+AEAQKl4e3tr+PDh2rt3r3MG/65duzKDfxFatmypyMhIORyOQts4HA5FRkZS6JdA7lD9du3aqUOHDtq7d69zqP7MmTMp9AFUSRT7AADgmhQ2g//111+vlStXKjMzs1jrWbJkiTIyMso5rfWGDx8ud3f3Qn/u4eGhBx98sAITVR779u0rUfv//Oc/iomJUaNGjfTnP/9ZLVq00Hfffces+gAgin0AAFCGcq/q79ixQ+3atdPIkSMVEhKimJgYnT9/vtDX7d+/X4899ph69+5t+8f7DRkyJM89+7+XlZWlgQMHVmCiymH9+vXq2LGjNm7ceNW227Zt0/DhwxUSEqLFixfrkUce0eHDhxUbG6uuXbtWQFoAqPwo9gEAQJlr27atVq5cqQMHDmjAgAF69dVXFRISorFjxyopKSlf+zlz5sjNzU3r169Xr169lJaWZkHqitGgQQN17dpVbm75v4a5ubmpa9euatSokQXJrBMbG6t77rlHaWlpmjt3boFtrly5ori4OHXp0sU5VH/hwoU6cuQIQ/UBoAAU+wAAoNyEhoZq/vz5+uWXXzR16lTFxcWpWbNmGj58uOLj4yVJJ0+e1N/+9jdlZWUpKytL33zzjW6//XZduHDB4vTl54EHHijwvn2Hw6Hhw4dbkMg6b775pgYNGqTs7GwZY7Ru3TodO3bM+fPcofqNGzfWsGHD1LhxY23cuFFbt27Vo48+Kh8fHwvTA0DlxWz8AACgwhQ0g3+NGjW0evXqPPf2e3p6KiIiQl9++aVq1KhhYeLycfbsWdWtW1dZWVl5lru7u+vkyZOqWbOmRckq1qxZszRx4sQ8yzw9PTVhwgT17dtX8+fP13vvvacaNWpoxIgRevLJJ6vcqAcAKCUevQcAACpeTk6OPv30U02dOlU7duzIV/RKv05U16pVK/3rX/+yZfF7zz33aP369c737u7urp49e+rzzz+3OFn5M8Zo/PjxmjdvXoFPbfDz81NaWpo6deqkMWPGqH///vLy8rIgKQC4LB69BwAAKl7uDP4DBw4s9BF9WVlZio+P180336xTp05VcMLyN2zYMOXk5Dj/b4zRAw88YGGiipGRkaFBgwZp/vz5hR77K1eu6IUXXtCWLVs0dOhQCn0AKAWu7AMAAEtkZmYqJCREJ06cKLKdp6enQkJC9M0336hBgwYVlK78Xbp0SbVq1XI+fcDb21unT59WQECAxcnKz6VLl9SnTx999dVXRT6RwOFwqEWLFtq3b1+BcxsAAK6KK/sAAMAa77zzjpKTk6/aLjMzU4mJierWrZuOHz9eAckqhr+/v+677z55enrKw8NDvXv3tnWhn5KSoltvvVVff/11kYW+9Osoh4SEBH3zzTcVlA4A7IdiHwAAVDhjjGbPnl3oMO7fy8zM1PHjx9W1a1cdPXq0nNNVnKFDhyorK0vZ2dkaMmSI1XHKzfHjxxUVFVXo/AwFcXNz07x588o5GQDYl4fVAQAAVVNsbKzVEWChCxcuKCIiQo0aNdLZs2d19uxZpaSk6MKFC3nuY3dzc5O7u7ukX+/hP3bsmNq1a6eXXnpJ9erVsyp+mcnOzpaPj4+MMbp48aIt+0VSUpKmTp2qlJQUSb8O0Xd3d5fD4ZAxRjk5OXmOeS53d3dt3LhRy5YtU2BgYEXHhoUGDBhgdQTAFrhnHwBgCe7DBQAUhP
IEKBNxXNkHAFhmzZo1XMFBqWVlZcnDw/W/yvzrX/+Sw+FQjx49rI4CWCo2NlYDBw60OgZgG67/GxIAAFRJdij0Jenmm2+2OgIAwIbs8VsSAADARbm5MV8yAKDs8dsFAAAAAACbodgHAAAAAMBmKPYBAAAAALAZin0AAAAAAGyGYh8AAAAAAJuh2AcAAAAAwGYo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAGBrr7zyioKCguRwOLRjxw6r4xTbQw89JB8fHzkcDl2+fNnqOBXms88+U1BQkD7++OMyaVdeXn31VdWpU0cOh0OLFy+2JENBOnbsKHd3d7Vr167M1z1y5EhVq1btqn2psHZWH7OCJCQk6KmnnlKbNm1UrVo1eXh4KCgoSC1atFCvXr20adMmqyMCQKlR7AMAbO3555/XW2+9ZXWMEluxYoUmTJhgdYwKZ4wp03blZcKECfr+++8tzVCQH374Qbfccku5rHvZsmVaunRpqdtZfcx+b/ny5YqMjNTOnTv12muv6dixY7p06ZK2b9+ul19+WefOndOuXbusjgkApeZhdQAAAIojPT1dt912W6UssFB2evXqpfPnz+dZVtCxL6gd/svhcFgdIZ/KdMw2b96sUaNG6eabb9Y//vEPeXj89ytxs2bN1KxZM1WvXl0HDhywMGXRrPxM5PMYcA0U+wAAl7B8+XIlJydbHcMSlbFwq0hV+diXlqenZ7mst7jnYkWcs8YYrV27VikpKXr00UdL9Npp06YpOztbr7zySp5C/7fuvPNO3XnnnWURtVxY2S/ok4BrYBg/AKDSGzdunMaPH69Dhw7J4XCoefPmkn79sv/aa6+pVatW8vb2VnBwsHr37q34+Pgi13fy5Ek1adJEHh4euuuuu5zLs7OzNWXKFIWEhMjX11dt27bVmjVrJEmLFi2Sv7+//Pz8tG7dOt19990KDAxUo0aNtHr16lK/t1WrVqlDhw7y8fGRv7+/mjRpopdfftn5czc3N3366ae6++67FRQUpPr16+vtt9/Os45vv/1WrVu3VlBQkHx8fBQZGal//OMfkqTZs2fLz89P1apVU3JyssaPH6+GDRsqISGhWPlef/11+fj4qE6dOho9erTq168vHx8fde3aVVu2bMnTtrjHY8OGDerUqZP8/PwUGBioyMhIpaamauPGjQoJCZHD4dCCBQskFXzsC2pX3O2X5DgWtV+vVVHn2rx58+Tv7y83NzfdeOONqlu3rjw9PeXv76/27dure/fuaty4sXx8fFS9enU999xz+dZ/8OBBhYeHy9/fX76+vurevbs2btxY7Ay5+3POnDlq2bKlvL29FRQUpGeffTbftorTrqBjVpJjkZ2drRkzZqhly5by9fVVrVq11LRpU82YMUMDBgxwtvviiy8UGBio6dOnF7rvMzIy9NVXX6lmzZrq1KlToe0Kep9leX5JRff/os6/wj4Ty+ozrKy3DcAiBgAAC0gya9asKXb7vn37mrCwsDzLpkyZYry8vMyqVavMuXPnzM6dO0379u1NrVq1zIkTJ5ztVq9ebSSZ7du3G2OMycjIMH379jXr1q3Ls74JEyYYb29vs3btWpOSkmImTZpk3NzczA8//GCMMWby5MlGkvnqq6/M+fPnTXJysunevbvx9/c3GRkZJd4Hc+fONZLMK6+8Ys6cOWPOnj1r3nrrLTN06NB82zt37pw5e/asueeee4y3t7e5dOmScz1xcXEmJibGnD171pw5c8ZERUWZmjVrOn+eu56xY8eaN954w/Tp08fs27ev2DlHjRpl/P39zd69e83ly5fNnj17TMeOHU21atVMYmKis11xjsfFixdNYGCgmTVrlklPTzcnTpwwffr0MadOnTLGGHPs2DEjybzxxhvO9RZ07AtqV9zzobjH8Wr79cCBA0aSefPNN4u9L3Nd7Vz7n//5HyPJbNmyxVy6dMmcPn
3a3HXXXUaS+fTTT82pU6fMpUuXzJgxY4wks2PHDue6b7vtNtOsWTPz888/m8zMTLN7927TuXNn4+PjY/bv31/sDJMnTzYOh8P85S9/MSkpKSYtLc0sXLgwT18qSbuCjllxj8X06dONu7u7WbdunUlLSzPbtm0zdevWNT169MizXz/55BNTrVo1M3Xq1EL3/f79+40kExUVVaJjVtbn19X6/9XOv4L6RVl9hpXHtotjzZo1hvIEKDOx9CYAgCWutdhPS0szAQEBZtCgQXna/e///q+RlOfL/m+L/czMTDN48GDz+eef53ldenq68fPzy7O+tLQ04+3tbR5//HFjzH+/KKenpzvb5BY1Bw8eLPZ7MebXPzhUr17d3HLLLXmWZ2VlmXnz5hW6vZUrVxpJZvfu3YWue8aMGUaSSU5OLnQ9JTFq1CgTFBSUZ9kPP/xgJJmXXnrJGFP847F7924jyXzyyScFbqu0xX5JzofSHsff79fSFvvFOddyi/0LFy442/ztb38zksyuXbvyvb/33nvPuey2224z119/fZ5t7ty500gyEyZMKFaGtLQ04+fnZ3r27JlnPb//w1lx2xlTdLF/tWPRsWNH06lTpzzbePTRR42bm5u5cuWKKYmtW7caSeb2228v9mvK+vwqTv//vd+ff7/vF+X5GVYW2y4Oin2gTMUyjB8A4JL27NmjixcvqkOHDnmWd+zYUV5eXvmGmEu/DjMdMmSI6tSpk2f4vvTrI7jS0tIUERHhXObr66t69eoVeVuAl5eXJCkzM7NE+Xfu3Klz587luyfY3d1dY8eOLfR1ufdiF7W93DbZ2dklylQSHTp0kJ+fn3PfFPd4NGvWTHXq1NGwYcMUExOjI0eOlEme0pwPv1Wc41hW+/Vaz7WsrKx8ma52/kVGRiooKEg7d+4sVoaDBw8qLS1Nt912W5HrLW67kijoWFy+fDnfbP7Z2dny9PSUu7t7idYfEBAgSUpLSyv2a8r6/CpN/7/a+Veen2HltW0A5YtiHwDgks6dOyfpv1/cf6t69eq6cOFCvuVPPvmkDhw4oMWLF2vv3r15fnbp0iVJ0gsvvCCHw+H8d/To0RIVBcWVmprqzHqtPv30U/Xo0UO1a9eWt7d3gfdwlwdvb2+dOnVKUvGPh6+vr77++mt169ZN06dPV7NmzTRo0CClp6dfU5bSnA9XU177taLPtVyenp7Ogu5qGY4fPy5Jql27dpHrLG67a3XPPfdo27ZtWrdundLT07V161Z9+OGH+uMf/1jiYr9Jkyby8fHR/v37i/2asj6/itP/S3r+leV5ZeW2AZQdin0AgEvK/ZJc0Jfsc+fOqVGjRvmWDxgwQOvXr1f16tU1fPjwPFdIc4uVuXPnyhiT59+mTZvKPH+DBg0kSadPn76m9SQmJur+++9XvXr1tGXLFp0/f16zZs0qi4hFyszMzLOfS3I82rRpo48//lhJSUmKjo7WmjVr9Oqrr15TntKcD0Upz/1a0eea9OtogLNnzyokJKRYGXx8fCRJV65cKXK9xW13rWJiYnTrrbdqxIgRCgwMVJ8+fTRgwAAtXbq0xOvy9vbWnXfeqdOnT+u7774rtN3Zs2c1cuRISWV/fl2t/5fm/Cur88rKbQMoWxT7AACXFBERoYCAAG3dujXP8i1btigjI0M33nhjvtfccsstqlWrlpYsWaJt27Zp2rRpzp/lzm6+Y8eOcs8u/Xp1sUaNGvrnP/95TevZtWuXMjMz9fjjj6tZs2by8fGpkMee/fvf/5YxRlFRUZKKfzySkpKcoypq166tV155Re3bt8830qKkSnM+FKU892tFn2uS9K9//Us5OTlq3759sTJERETIzc1NGzZsKHK9xW13rfbs2aNDhw7p1KlTyszMVGJiohYtWqTg4OBSrS8mJkbe3t565plnCh1Vsnv3budj+cr6/Lpa/y/N+VdW55WV2wZQtij2AQAuoUaNGkpKStKRI0d04cIFubu7a/z48frggw
/0zjvvKDU1Vbt27dJjjz2m+vXra9SoUYWu67777tOIESM0ffp0bdu2TdKvVygfeughrV69WosWLVJqaqqys7N1/Phx/ec//ynz9+Pt7a1Jkybpm2++0ZgxY/TLL78oJydHFy5cKFHhm3ul9ssvv9Tly5d14MCBq94/XBo5OTlKSUlRVlaWdu7cqXHjxikkJEQjRoyQ9Ov+K87xSEpK0ujRoxUfH6+MjAxt375dR48edf7RoCC/P/YF3Vtc3O0XV3nu14o41zIyMnT+/HllZWXpxx9/1JgxYxQaGprneBWVoXbt2urbt6/Wrl2r5cuXKzU1VTt37tSSJUvybKe47a7Vk08+qZCQEF28eLHIdp9//vlVH70nSe3atdO7776r3bt3q3v37vrss890/vx5ZWZm6ueff9bSpUv1yCOPOO9VL+vz62r9vzjnX0GfiWVxXlm5bQBlrCKnAwQAIJdKOBv/jz/+aEJDQ42vr6/p1q2bOXHihMnJyTFz5swx1113nfH09DTBwcHm/vvvNwkJCc7Xvf/++yY4ONhIMk2aNDHJyckmNTXVNG7c2EgyAQEBZuXKlcYYY65cuWKio6NNSEiI8fDwMLVr1zZ9+/Y1e/bsMQsXLjR+fn5GkrnuuuvMoUOHzJIlS0xgYKCRZEJDQ/M81qy4FixYYCIjI42Pj4/x8fExN9xwg1m4cKGZNWuW8fX1zbO9d955x/leGjVq5JyRPzo62tSoUcNUr17d9O/f3yxYsMBIMmFhYebJJ590rqdx48Zm1apVJc44atQo4+npaRo2bGg8PDxMYGCg6d27tzl06FCedsU5HkeOHDFdu3Y1wcHBxt3d3TRo0MBMnjzZZGVlmTfeeMPUq1fPSDJ+fn7mvvvuK/DYv/DCCwW2K872S3Ici9qv48aNM3Xr1jWSjL+/v+nTp0+J9mlR59q8efOcGZs0aWK+/fZbM3PmTBMUFGQkmbp165p3333XvPfee84MwcHBZvXq1cYYY1asWGFuueUWU6dOHePh4WFq1qxpBg8ebI4ePVrsDMYYc+HCBTNy5EhTs2ZNExAQYLp162amTJniPP9++umnYrcr6NiW5Fh8/fXXpmbNmkaS85+np6dp1aqVef/9953v6bPPPjPVqlUz06ZNK9ZxSExMNBMmTDCRkZEmICDAuLu7m+rVq5sbbrjBPPLII+a7775zti3r88uYwvu/MUWff4mJiQV+JpbVZ1hZb7u4mI0fKFOxDmN+N7UpAAAVwOFwaM2aNRowYIDVUXAVo0ePVlxcnM6cOWN1FFRRixYt0oEDBzR37lznsoyMDE2cOFGLFi1SSkqKfH19LUyIshAbG6uBAwfme/ICgFKJ87A6AQAAqPzK8zF+QFFOnDihMWPG5Lsf3MvLSyEhIcrMzFRmZibFPgD8DvfsAwBQRuLj4/M8dqqwf4MGDSKnzbBPy4+vr688PT21fPlynTx5UpmZmUpKStKyZcs0ZcoUDRo0SIGBgVbHBIBKhyv7AACUkfDwcJcYflqSnJMmTdKKFSuUkZGhpk2bas6cOerXr185J3Q9rnLsXVFQUJD++c9/aurUqWrRooUuXbqkgIAAtWnTRjNnztSjjz5qdUQAqJQo9gEAQKFmzJihGTNmWB0DVVz37t21fv16q2MAgEthGD8AAAAAADZDsQ8AAAAAgM1Q7AMAAAAAYDMU+wAAAAAA2AzFPgAAAAAANkOxDwAAAACAzVDsAwAAAABgMxT7AAAAAADYDMU+AAAAAAA2Q7EPAAAAAIDNUOwDAAAAAGAzFPsAAAAAANgMxT4AAAAAADbjYXUAAEDVtWnTJqsjAAAqCX4nAGXLYYwxVocAAFQ9DofD6ggAgEqI8gQoE3Fc2QcAWIIvc8CvBgwYIEmKjY21OAkAwE64Zx8AAAAAAJuh2AcAAAAAwGYo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAAAAAA
CbodgHAAAAAMBmKPYBAAAAALAZin0AAAAAAGyGYh8AAAAAAJuh2AcAAAAAwGYo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAAAAAACbodgHAAAAAMBmKPYBAAAAALAZin0AAAAAAGyGYh8AAAAAAJuh2AcAAAAAwGYo9gEAAAAAsBmKfQAAAAAAbIZiHwAAAAAAm6HYBwAAAADAZij2AQAAAACwGYp9AAAAAABshmIfAAAAAACb8bA6AAAAQFWxYcMGbd68Oc+y+Ph4SdKsWbPyLI+KitLNN99cYdkAAPbiMMYYq0MAAABUBevXr9cdd9whT09PubkVPMAyJydHmZmZ+uc//6mePXtWcEIAgE3EUewDAABUkOzsbNWtW1dnzpwpsl1wcLCSk5Pl4cEgTABAqcRxzz4AAEAFcXd319ChQ+Xl5VVoGy8vLz3wwAMU+gCAa0KxDwAAUIEGDx6sjIyMQn+ekZGhwYMHV2AiAIAdMYwfAACggoWGhioxMbHAnzVq1EiJiYlyOBwVnAoAYCMM4wcAAKhow4YNk6enZ77lXl5eevDBByn0AQDXjGIfAACggg0bNkyZmZn5lmdkZGjQoEEWJAIA2A3FPgAAQAVr1aqVWrVqlW95eHi4IiIiLEgEALAbin0AAAALDB8+PM9Qfk9PTz344IMWJgIA2AkT9AEAAFggMTFRTZo0Ue5XMYfDocOHD6tJkybWBgMA2AET9AEAAFghJCREHTp0kJubmxwOhzp27EihDwAoMxT7AAAAFhk+fLjc3Nzk7u6uBx54wOo4AAAbYRg/AACARU6dOqX69etLkn755RfVrVvX4kQAAJuI87A6AQAAqNz69++vtWvXWh3D9urVq2d1BNvq16+f4uLirI4BABWKYh8AAFxVVFSUnn76aatj2NKGDRvkcDh00003WR3FlubOnWt1BACwBMU+AAC4qkaNGmnAgAFWx7Clu+66S5IUGBhocRJ74oo+gKqKYh8AAMBCFPkAgPLAbPwAAAAAANgMxT4AAAAAADZDsQ8AAAAAgM1Q7AMAAAAAYDMU+wAAAAAA2AzFPgAAAAAANkOxDwAAAACAzVDsAwAAAABgMxT7AAAAAADYDMU+AAAAAAA2Q7EPAAAAAIDNUOwDAAAAAGAzFPsAAAAAANgMxT4AAMA1ePXVV1WnTh05HA4tXrzY6jgl8v7776tZs2ZyOBx5/nl5ealOnTrq0aOH5syZo5SUFKujAgBKiGIfAADgGkyYMEHff/+91TFKpW/fvjp8+LDCwsIUFBQkY4xycnKUnJys2NhYNW3aVNHR0WrTpo22bt1qdVwAQAlQ7AMAgEovPT1dXbt2dfltuAKHw6Hq1aurR48eWrFihWJjY3Xy5En16tVL58+ftzoeAKCYKPYBAEClt3z5ciUnJ7v8NlxRv379NGLECCUnJ7vcbQoAUJVR7AMAgDJnjNFrr72mVq1aydvbW8HBwerdu7fi4+OdbcaMGSMvLy/Vq1fPueyJJ56Qv7+/HA6HTp8+LUkaN26cxo8fr0OHDsnhcKh58+Z6/fXX5ePjozp16mj06NGqX7++fHx81LVrV23ZsqVMtnGtvv32W7Vu3VpBQUHy8fFRZGSk/vGPf0iSRo4c6bw/PiwsTNu3b5ckPfTQQ/Lz81NQUJA++ugjSVJ2dramTJmikJAQ+fpA8dmDAAAJRUlEQVT6qm3btlqzZo0kafbs2fLz81O1atWUnJys8ePHq2HDhkpISNAXX3yhwMBATZ8+/Zrfy4gRIyRJn3/+uXNZUbkWLVokf39/+fn5ad26dbr77rsVGBioRo0aafXq1XnWvWHDBnXq1El+fn4KDAxUZGSkUlNTr7oNAMBVGAAAgCL069fP9OvXr0SvmTJlivHy8jKrVq0y586dMzt37jTt27c3tWrVMidOnHC2Gzp0qKlbt26e186ZM8dIMqdOnXIu69u3rwkLC8
vTbtSoUcbf39/s3fv/27u3mKjONYzjz8gAAzrjQDtqq0DAQwjaJo2HoLWJ9q4aG0FMSeOFhzSCaaOJbUk9ROMhSmz1QkVjmpqoidrBxlhPMdr0zm00ajRYgWiAtkjEURxAEJR3X3TvSadqVRxO4/+XzAXrW6z3+TJz82atb33XrLW11crKymz8+PHmdrutpqYmIjVeVGVlpUmyHTt2hI75/X5bvXq13b171wKBgGVnZ9sbb7wRVi8mJsb+/PPPsGt9+umnduTIkdDfX375pcXHx1tpaandu3fPli1bZv369bPz58+bmdny5ctNki1evNi2bt1qubm59ttvv9nRo0fN7XbbmjVrnpt/+PDhNnDgwGeOB4NBk2QpKSkvnevMmTN2//59u337tn3wwQfWv39/a2trMzOzpqYm83g8VlxcbC0tLVZXV2e5ubmh7+V5NV5EZ36/ABAFfuTOPgAAiKiWlhZt3rxZubm5mjNnjgYOHKh33nlHO3fu1J07d7Rr166I1XI6naGnB7KyslRSUqLGxkbt3r07YjU6Ky8vT6tWrVJSUpKSk5P18ccfKxAIqL6+XpJUWFiox48fh2UNBoM6f/68pk2bJklqbW1VSUmJcnJyNGvWLHm9Xq1YsUKxsbFPzHHjxo36/PPPdejQIWVmZmr69OkKBoNauXLlK8/F7XbL4XCosbHxpXNNmjRJHo9HPp9P+fn5am5uVk1NjSSpqqpKwWBQo0ePlsvl0uDBg3Xo0CG9+eabL1UDAPAkmn0AABBRZWVlampq0rhx48KOjx8/XnFxcWGP2UfauHHjlJiYGLZcoLeIjY2V9Nej6ZL04YcfatSoUfrhhx9kZpKkAwcOKD8/XzExMZKk8vJyPXjwQGPGjAldJyEhQUOGDOnWOTY3N8vM5PF4XilXXFycJKm9vV2SlJGRoUGDBmnOnDlavXq1qqqqQuf2lrkDQF9Fsw8AACKqoaFBkjRgwIAnxrxeb+jucFeJj48P3T3vSceOHdOUKVPk8/kUHx+vr7/+Omzc4XCooKBAN2/e1JkzZyRJe/bs0YIFC0LnNDc3S5JWrFgRWuPvcDhUXV2tBw8edNtcKioqJEmZmZkRzZWQkKBffvlFkydP1vr165WRkaH8/Hy1tLT0mrkDQF9Fsw8AACLK6/VK0lOb+oaGBg0bNqzLare3t3d5jRdRU1OjnJwcDRkyROfOndP9+/dVXFz8xHlz586Vy+XS999/r/Lycnk8HqWlpYXGfT6fJGnLli0ys7DP2bNnu20+J0+elCR99NFHEc81evRo/fzzz6qtrVVRUZEOHjyob7/9ttfMHQD6KmdPBwAAANFlzJgxGjBggC5cuBB2/Ny5c2pra9PYsWNDx5xOZ+iR7kj49ddfZWbKzs7ushov4urVq2pvb9eiRYuUkZEh6a87+f+UlJSkTz75RAcOHJDb7dZnn30WNp6SkiKXy6XLly93S+6nqaur05YtWzRs2DDNnz8/orlqa2vV0NCgrKws+Xw+bdiwQadOndK1a9d6xdwBoC/jzj4AAIgol8ulpUuX6qefftK+ffsUDAZ19epVFRYW6q233tLChQtD544YMUJ3797V4cOH1d7ervr6elVXVz9xzeTkZNXW1qqqqkqNjY2h5r2jo0P37t3To0ePdOXKFS1ZskSpqamhreIiUaMzUlNTJUmnT59Wa2urKisrn/mugsLCQj18+FBHjx7VjBkzwsZcLpfmzZun/fv3q6SkRMFgUI8fP9Yff/yhW7du/WuGEydOvNTWe2ampqYmdXR0yMxUX1+vgwcP6v3331dMTIwOHz4cWrP/Krn+rra2VgUFBbp+/bra2tp06dIlVVdXKzs7O2I1AOC11e0bAAAAgD6lM1uXdXR02KZNm2zkyJEWGxtrSUlJlpOTY+Xl5WHnBQIBmzp1qrlcLktPT7cvvvjCvvrqK5NkI0aMCG2hd/HiRUtLS7OEhASbPHmy1dXV2cKFCy02NtaGDh1qTqfTPB
6PzZw5027cuBGxGi/iu+++s8GDB5sk69+/v+Xm5pqZWVFRkSUnJ5vX67XZs2fbtm3bTJINHz48bGtAM7P33nvPvvnmm6de/+HDh1ZUVGSpqanmdDrN5/PZrFmzrKyszIqLiy0hISG0Ld7evXtD/3f8+HFzu922bt26Z2Y/cuSIvfvuu5aYmGhxcXHWr18/k2QOh8O8Xq9NmDDB1qxZY4FA4KVybd++3RITE02SjRw50m7cuGG7du0yj8djkiwtLc0qKiqsqqrKJk2aZElJSRYTE2Nvv/22LV++3B49evTcGi+KrfcAvKZ+dJj97/WvAAAATzF79mxJkt/v7+Ek4QoKCuT3+xUIBHo6yiubPn26tm3bpvT09J6OEnV66+8XALqYn8f4AQBAn/X/bez6mr8vEbhy5YpcLheNPgAgomj2AQAA/uH69eth270965Ofn9+p6xcVFamyslIVFRWaN2+e1q5dG+EZAABed7yNHwAA9DnLli3T7t271dbWpvT0dG3atEl5eXkRu35mZqa6cqVjYmKiMjMzNXToUG3fvl1ZWVldVgsA8HpizT4AAPhXrHlGX8bvF8BrijX7AAAAAABEG5p9AAAAAACiDM0+AAAAAABRhmYfAAAAAIAoQ7MPAAAAAECUodkHAAAAACDK0OwDAAAAABBlaPYBAAAAAIgyNPsAAAAAAEQZmn0AAAAAAKIMzT4AAAAAAFGGZh8AAAAAgChDsw8AAAAAQJRx9nQAAADQ+5WWlsrhcPR0DKBT8vLyejoCAHQ7h5lZT4cAAAC919mzZ/X777/3dAyg01JSUjRx4sSejgEA3clPsw8AAAAAQHTxs2YfAAAAAIAoQ7MPAAAAAECUodkHAAAAACDKOCX5ezoEAAAAAACImP/8F95OIzOkO3FoAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<IPython.core.display.Image object>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 147
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G8B73w06Wxxm"
      },
      "source": [
        "Visualizing the model makes it much easier to understand.\n",
        "\n",
        "Essentially what we're doing is trying to encode as much information about our sequences as possible into various embeddings (the inputs to our model) so our model has the best chance to figure out what label belongs to a sequence (the outputs of our model).\n",
        "\n",
        "You'll notice our model is looking very similar to the model shown in Figure 1 of [*Neural Networks for Joint Sentence Classification\n",
        "in Medical Paper Abstracts*](https://arxiv.org/pdf/1612.05251.pdf). However, a few differences still remain:\n",
        "* We're using pretrained TensorFlow Hub token embeddings instead of GloVe embeddings.\n",
        "* We're using a Dense layer on top of our token-character hybrid embeddings instead of a bi-LSTM layer.\n",
        "* Section 3.1.3 of the paper mentions a label sequence optimization layer (which helps to make sure sequence labels come out in a respectable order) but it isn't shown in Figure 1. To make up for the lack of this layer in our model, we've created the positional embedding layers.\n",
        "* Section 4.2 of the paper mentions the token and character embeddings are updated during training, whereas our pretrained TensorFlow Hub embeddings remain frozen.\n",
        "* The paper uses the [`SGD`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/SGD) optimizer, whereas we're going to stick with [`Adam`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam).\n",
        "\n",
        "All of the differences above are potential extensions of this project."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ud8arQOTUtRl",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "7089ed58-456a-4855-fce7-a2e4a3f25a88"
      },
      "source": [
        "# Check which layers of our model are trainable or not\n",
        "for layer in model_5.layers:\n",
        "  print(layer, layer.trainable)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fe7337ba150> True\n",
            "<tensorflow.python.keras.layers.preprocessing.text_vectorization.TextVectorization object at 0x7fe7d4286310> True\n",
            "<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fe7387b7490> True\n",
            "<tensorflow.python.keras.layers.embeddings.Embedding object at 0x7fe7d4286790> True\n",
            "<tensorflow_hub.keras_layer.KerasLayer object at 0x7fe86c77c910> False\n",
            "<tensorflow.python.keras.layers.wrappers.Bidirectional object at 0x7fe73870b150> True\n",
            "<tensorflow.python.keras.layers.merge.Concatenate object at 0x7fe733479950> True\n",
            "<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fe7334f2c90> True\n",
            "<tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7fe733237fd0> True\n",
            "<tensorflow.python.keras.layers.core.Dense object at 0x7fe733263e10> True\n",
            "<tensorflow.python.keras.layers.core.Dense object at 0x7fe7337e0b90> True\n",
            "<tensorflow.python.keras.layers.core.Dense object at 0x7fe733255690> True\n",
            "<tensorflow.python.keras.layers.core.Dropout object at 0x7fe7332686d0> True\n",
            "<tensorflow.python.keras.layers.merge.Concatenate object at 0x7fe73376ee10> True\n",
            "<tensorflow.python.keras.layers.core.Dense object at 0x7fe73c764ed0> True\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RqUCaJPKY9o_"
      },
      "source": [
        "Now that our model is constructed, let's compile it.\n",
        "\n",
        "This time, we're going to introduce a new parameter to our loss function called `label_smoothing`. Label smoothing helps to regularize our model (prevent overfitting) by making sure it doesn't get too focused on applying one particular label to a sample.\n",
        "\n",
        "For example, instead of training against a hard one-hot label of:\n",
        "* `[0.0, 0.0, 1.0, 0.0, 0.0]` for a sample (all of the weight is on index 2).\n",
        "\n",
        "The label gets smoothed to be something like:\n",
        "* `[0.01, 0.01, 0.96, 0.01, 0.01]` giving a small value to each of the other labels, which discourages overconfident predictions and, in turn, hopefully improves generalization.\n",
        "\n",
        "> 📖 **Resource:** For more on label smoothing, see the great blog post by PyImageSearch, [*Label smoothing with Keras, TensorFlow, and Deep Learning*](https://www.pyimagesearch.com/2019/12/30/label-smoothing-with-keras-tensorflow-and-deep-learning/)."
      ]
    },
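    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the idea concrete, here's a minimal plain-Python sketch of what label smoothing does to a one-hot label. It assumes the rule described in the Keras `CategoricalCrossentropy` documentation (scale the label by `1 - label_smoothing`, then add `label_smoothing / num_classes` to every class). The helper name `smooth_labels` is made up for illustration and isn't part of TensorFlow.\n",
        "\n",
        "```python\n",
        "def smooth_labels(one_hot, label_smoothing=0.2):\n",
        "    # Blend the one-hot label with a uniform distribution over the classes\n",
        "    num_classes = len(one_hot)\n",
        "    return [y * (1.0 - label_smoothing) + label_smoothing / num_classes\n",
        "            for y in one_hot]\n",
        "\n",
        "smoothed = smooth_labels([0.0, 0.0, 1.0, 0.0, 0.0], label_smoothing=0.2)\n",
        "print([round(value, 2) for value in smoothed])  # [0.04, 0.04, 0.84, 0.04, 0.04]\n",
        "```\n",
        "\n",
        "With `label_smoothing=0.2` and 5 classes, the correct class keeps 0.84 of the weight and every other class gets 0.04, so the values still sum to 1."
      ]
    },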
    {
      "cell_type": "code",
      "metadata": {
        "id": "nwYd_dWPS8EB"
      },
      "source": [
        "# Compile token, char, positional embedding model\n",
        "model_5.compile(loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.2), # add label smoothing (examples which are really confident get smoothed a little)\n",
        "                optimizer=tf.keras.optimizers.Adam(),\n",
        "                metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vrXEGlcUZXAE"
      },
      "source": [
        "### Create tribrid embedding datasets and fit tribrid model\n",
        "\n",
        "Model compiled!\n",
        "\n",
        "Again, to keep our experiments swift, let's fit on 10% of the training batches (roughly 18,000 examples) for 3 epochs.\n",
        "\n",
        "This time our model requires four feature inputs:\n",
        "1. Train line numbers one-hot tensor (`train_line_numbers_one_hot`)\n",
        "2. Train total lines one-hot tensor (`train_total_lines_one_hot`)\n",
        "3. Token-level sequences tensor (`train_sentences`)\n",
        "4. Char-level sequences tensor (`train_chars`)\n",
        "\n",
        "We can pass these as a tuple to the `tf.data.Dataset.from_tensor_slices()` method to create appropriately shaped and batched `PrefetchDataset`s."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8FDNHSIRyEE2",
        "outputId": "3dcfcbfc-7ae5-436c-f9e0-8c1f64b3602f"
      },
      "source": [
        "# Create training and validation datasets (all four kinds of inputs)\n",
        "train_pos_char_token_data = tf.data.Dataset.from_tensor_slices((train_line_numbers_one_hot, # line numbers\n",
        "                                                                train_total_lines_one_hot, # total lines\n",
        "                                                                train_sentences, # train tokens\n",
        "                                                                train_chars)) # train chars\n",
        "train_pos_char_token_labels = tf.data.Dataset.from_tensor_slices(train_labels_one_hot) # train labels\n",
        "train_pos_char_token_dataset = tf.data.Dataset.zip((train_pos_char_token_data, train_pos_char_token_labels)) # combine data and labels\n",
        "train_pos_char_token_dataset = train_pos_char_token_dataset.batch(32).prefetch(tf.data.AUTOTUNE) # turn into batches and prefetch appropriately\n",
        "\n",
        "# Validation dataset\n",
        "val_pos_char_token_data = tf.data.Dataset.from_tensor_slices((val_line_numbers_one_hot,\n",
        "                                                              val_total_lines_one_hot,\n",
        "                                                              val_sentences,\n",
        "                                                              val_chars))\n",
        "val_pos_char_token_labels = tf.data.Dataset.from_tensor_slices(val_labels_one_hot)\n",
        "val_pos_char_token_dataset = tf.data.Dataset.zip((val_pos_char_token_data, val_pos_char_token_labels))\n",
        "val_pos_char_token_dataset = val_pos_char_token_dataset.batch(32).prefetch(tf.data.AUTOTUNE) # turn into batches and prefetch appropriately\n",
        "\n",
        "# Check input shapes\n",
        "train_pos_char_token_dataset, val_pos_char_token_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(<PrefetchDataset shapes: (((None, 15), (None, 20), (None,), (None,)), (None, 5)), types: ((tf.float32, tf.float32, tf.string, tf.string), tf.float64)>,\n",
              " <PrefetchDataset shapes: (((None, 15), (None, 20), (None,), (None,)), (None, 5)), types: ((tf.float32, tf.float32, tf.string, tf.string), tf.float64)>)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 150
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "LiAjolB7yLxw",
        "outputId": "0bcb5c03-ce65-43b1-ba61-8a175191c41f"
      },
      "source": [
        "# Fit the token, char and positional embedding model\n",
        "history_model_5 = model_5.fit(train_pos_char_token_dataset,\n",
        "                              steps_per_epoch=int(0.1 * len(train_pos_char_token_dataset)),\n",
        "                              epochs=3,\n",
        "                              validation_data=val_pos_char_token_dataset,\n",
        "                              validation_steps=int(0.1 * len(val_pos_char_token_dataset)))"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/3\n",
            "562/562 [==============================] - 35s 49ms/step - loss: 1.2180 - accuracy: 0.6368 - val_loss: 0.9859 - val_accuracy: 0.8055\n",
            "Epoch 2/3\n",
            "562/562 [==============================] - 24s 43ms/step - loss: 0.9793 - accuracy: 0.8098 - val_loss: 0.9552 - val_accuracy: 0.8231\n",
            "Epoch 3/3\n",
            "562/562 [==============================] - 23s 41ms/step - loss: 0.9620 - accuracy: 0.8161 - val_loss: 0.9476 - val_accuracy: 0.8268\n"
          ],
          "name": "stdout"
        }
      ]
    },
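    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The `562/562` in the training log comes from `steps_per_epoch=int(0.1 * len(train_pos_char_token_dataset))`. Here's a rough plain-Python sketch of that arithmetic, assuming a batch size of 32 and using the 30,212 validation samples that show up in the prediction outputs of this notebook. `subset_steps` is a hypothetical helper written for this example, not a Keras function.\n",
        "\n",
        "```python\n",
        "import math\n",
        "\n",
        "def subset_steps(num_samples, batch_size=32, fraction=0.1):\n",
        "    # Number of batches in the full dataset (the last batch may be partial)\n",
        "    num_batches = math.ceil(num_samples / batch_size)\n",
        "    # Same expression as int(0.1 * len(dataset)) in the fit call\n",
        "    return int(fraction * num_batches)\n",
        "\n",
        "print(subset_steps(30212))  # 94 validation steps (10% of 945 batches)\n",
        "print(562 * 32)  # 17984 training examples seen per epoch\n",
        "```\n",
        "\n",
        "So each training epoch sees 562 * 32 = 17,984 examples, and validation during training only uses 94 of the 945 validation batches."
      ]
    },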
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "fS88IaN_auu8"
      },
      "source": [
        "Tribrid model trained! Time to make some predictions with it and evaluate them just as we've done before."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "V6AtA9ffcC8Y",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "aaf2d1ab-713c-4bf4-ddb7-99e7ff14e306"
      },
      "source": [
        "# Make predictions with token-char-positional hybrid model\n",
        "model_5_pred_probs = model_5.predict(val_pos_char_token_dataset, verbose=1)\n",
        "model_5_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 21s 20ms/step\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "array([[0.50166875, 0.10549477, 0.01121361, 0.3581518 , 0.02347112],\n",
              "       [0.59197664, 0.08956539, 0.04119462, 0.2669149 , 0.01034847],\n",
              "       [0.2629166 , 0.12741803, 0.16365907, 0.3686114 , 0.077395  ],\n",
              "       ...,\n",
              "       [0.03510058, 0.09528926, 0.03802438, 0.03442893, 0.7971568 ],\n",
              "       [0.02894053, 0.33607832, 0.086002  , 0.02262262, 0.5263566 ],\n",
              "       [0.15117936, 0.56498915, 0.15235284, 0.03171322, 0.09976543]],\n",
              "      dtype=float32)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 152
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "l7x2LKrFc6CN",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "d20e77fe-1b52-4365-c07a-c02a0e578e71"
      },
      "source": [
        "# Turn prediction probabilities into prediction classes\n",
        "model_5_preds = tf.argmax(model_5_pred_probs, axis=1)\n",
        "model_5_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(30212,), dtype=int64, numpy=array([0, 0, 3, ..., 4, 4, 1])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 153
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dogdVk02dO62",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "bc91a680-dfdf-466d-ced6-c0972f5cc6cc"
      },
      "source": [
        "# Calculate results of token-char-positional hybrid model\n",
        "model_5_results = calculate_results(y_true=val_labels_encoded,\n",
        "                                    y_pred=model_5_preds)\n",
        "model_5_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 82.83132530120481,\n",
              " 'f1': 0.8272937671199255,\n",
              " 'precision': 0.8268115620164092,\n",
              " 'recall': 0.8283132530120482}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 154
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yranVE5soBdf"
      },
      "source": [
        "## Compare model results\n",
        "\n",
        "Far out, we've come a long way: from a baseline model to training a model containing three different kinds of embeddings.\n",
        "\n",
        "Now it's time to compare our models' performance against each other.\n",
        "\n",
        "We'll also be able to compare our models to the [*PubMed 200k RCT:\n",
        "a Dataset for Sequential Sentence Classification in Medical Abstracts*](https://arxiv.org/pdf/1710.06071.pdf) paper.\n",
        "\n",
        "Since all of our model results are in dictionaries, let's combine them into a pandas DataFrame to visualize them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uJtoRSYGb2VP",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 225
        },
        "outputId": "f99735fa-b48d-4b36-895a-af4442e928fa"
      },
      "source": [
        "# Combine model results into a DataFrame\n",
        "all_model_results = pd.DataFrame({\"baseline\": baseline_results,\n",
        "                                  \"custom_token_embed_conv1d\": model_1_results,\n",
        "                                  \"pretrained_token_embed\": model_2_results,\n",
        "                                  \"custom_char_embed_conv1d\": model_3_results,\n",
        "                                  \"hybrid_char_token_embed\": model_4_results,\n",
        "                                  \"tribrid_pos_char_token_embed\": model_5_results})\n",
        "all_model_results = all_model_results.transpose()\n",
        "all_model_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>accuracy</th>\n",
              "      <th>precision</th>\n",
              "      <th>recall</th>\n",
              "      <th>f1</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>baseline</th>\n",
              "      <td>72.183238</td>\n",
              "      <td>0.718647</td>\n",
              "      <td>0.721832</td>\n",
              "      <td>0.698925</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>custom_token_embed_conv1d</th>\n",
              "      <td>78.717066</td>\n",
              "      <td>0.783722</td>\n",
              "      <td>0.787171</td>\n",
              "      <td>0.784680</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>pretrained_token_embed</th>\n",
              "      <td>71.395472</td>\n",
              "      <td>0.714320</td>\n",
              "      <td>0.713955</td>\n",
              "      <td>0.710898</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>custom_char_embed_conv1d</th>\n",
              "      <td>65.142990</td>\n",
              "      <td>0.646008</td>\n",
              "      <td>0.651430</td>\n",
              "      <td>0.638715</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>hybrid_char_token_embed</th>\n",
              "      <td>74.026877</td>\n",
              "      <td>0.740302</td>\n",
              "      <td>0.740269</td>\n",
              "      <td>0.738099</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>tribrid_pos_char_token_embed</th>\n",
              "      <td>82.831325</td>\n",
              "      <td>0.826812</td>\n",
              "      <td>0.828313</td>\n",
              "      <td>0.827294</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "                               accuracy  precision    recall        f1\n",
              "baseline                      72.183238   0.718647  0.721832  0.698925\n",
              "custom_token_embed_conv1d     78.717066   0.783722  0.787171  0.784680\n",
              "pretrained_token_embed        71.395472   0.714320  0.713955  0.710898\n",
              "custom_char_embed_conv1d      65.142990   0.646008  0.651430  0.638715\n",
              "hybrid_char_token_embed       74.026877   0.740302  0.740269  0.738099\n",
              "tribrid_pos_char_token_embed  82.831325   0.826812  0.828313  0.827294"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 156
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9G--0tQkb5tq"
      },
      "source": [
        "# Reduce the accuracy to same scale as other metrics\n",
        "all_model_results[\"accuracy\"] = all_model_results[\"accuracy\"]/100"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "JHtN7qJ3cAA3",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 571
        },
        "outputId": "05a7f686-9028-4d91-c1b4-7034c24f4801"
      },
      "source": [
        "# Plot and compare all of the model results\n",
        "all_model_results.plot(kind=\"bar\", figsize=(10, 7)).legend(bbox_to_anchor=(1.0, 1.0));"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAqkAAAIqCAYAAAAHAtOxAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nOzde5QU9Z3//9druIgIEi8DooiAMgyDgODIeo3GS4Jr8J6Iumqy2bAxkmTNRc1m1xhz1cRkvybmF7y7iYYY4wUiCYmJwu5qIgMKyE0RCWq8oCKgBGHg/fuja0yDM0zP0D1V0/18nDNnuqo+0/2eOn2qX/2pqs/HESEAAAAgS6rSLgAAAADYHiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmdM1rRfee++9Y9CgQWm9PAAAQMHmzp37WkRUp11HJUktpA4aNEgNDQ1pvTwAAEDBbP8l7RoqDaf7AQAAkDmEVAAAAGQOIRUAAACZk9o1qQAAAJ3Z3Llz+3bt2vVmSQeLjr+22irpqcbGxn859NBDX22uASEVAACgHbp27XrzPvvsM7y6unpNVVVVpF1PZ7J161avXr267uWXX75Z0qnNtSH1AwAAtM/B1dXV6wiobVdVVRXV1dVrleuFbr5NB9YDAABQTqoIqO2X7LsWsyghFQAAAJnDNakAAABFMOiKBw8t5vOt/M4pc4v5fDtj8+bN6tatW4e+Jj2pAAAAndiJJ5544IgRI4YfdNBBI773ve/tLUn33HPP7nV1dcOHDRtWd8QRR9RI0tq1a6vOPvvsQTU1NXU1NTV1t99++/skqWfPnmOanuu2227b46yzzhokSWedddag8847b+CoUaNqL7744gEPP/xwz0MOOaR2+PDhdWPGjKmdP3/+LpLU2NioSZMmDRg6dOiImpqaum9+85t9p02b1vvEE088sOl577vvvt1POumkA9UG9KQCAAB0YnfeeefKfv36bXnrrbc8ZsyYunPOOefNyZMnD3rkkUeW1tbWbnrllVe6SNIVV1zRf/fdd9/y9NNPL5ak1atXd2ntuV966aXu8+bNW9q1a1e98cYbVXPmzFnarVs33X///b0vu+yyATNnznz2uuuuq161alX3xYsXL+rWrZteeeWVLtXV1Vs+97nPDfzrX//add9992289dZb9/r4xz/+Wlv+L0IqAABAJ3bNNdf0e/DBB98nSS+//HK366+/vnrcuHHra2trN0lSv379tkjS7Nmzd586deqKpr+rrq7e0tpzn3nmmWu6ds3FxTfeeKPLOeecM3jlypU9bMfmzZstSX/84x93/9SnPrW66XKAptf76Ec/+vpNN9205yWXXPL6vHnzet17773PteX/IqQCAAB0Ur/+9a97z5o1q3dDQ8PS3r17bx03btywMWPGbFi2bFmPQp/D9ruP//a3vzl/W69evbY2Pb788sv3O/bYY9f//ve/f3bZsmXdjz/++GE7et6LL7749VNOOeWgHj16xIQJE9a09ZpWrkkFAADopN58880uffr02dK7d++tTzzxRI/58+fvtnHjxqrHH3+899KlS7tLUtPp/mOPPXbdD37wg75Nf9t0un+vvfbaPG/evB5btmzRAw88sEdLr7Vu3bouAwYM2CRJU6ZM2btp/QknnLBuypQpe2/evFn5rzdo0KDN/fr123zdddf1nzRpUptO9UuEVAAAgE7rrLPOWtvY2OghQ4aM+NKXvrTf6NGj3+7bt2/j9ddfv/KMM844aNiwYXVnnHHGEEn69re//dKbb77ZZejQoSOGDRtWN2PGjN6S9LWvfe3F00477aCxY8fW9uvXb3NLr3X55Ze/fNVVVw0YPnx4XWNj47vrL7300tUDBgzYVFtbO2LYsGF1t9xyy55N2yZOnPh6//79N40dO3ZjW/83R6QzBm19fX00NDSk8toAAABtYXtuRN
Tnr5s/f/7K0aNHt7mHsJJceOGFA8eMGbPh0ksvbXY/zZ8/f+/Ro0cPam4b16QCAIB3DbriwTa1X/mdU9rUfuQdIwtue/e3G1tvlGf40iVtao/SGjFixPBdd91165QpU55vz98TUgEAQPtd1adt7QcPLE0dyJxFixbt1LcGrkkFAABA5hBSAQAAkDmEVAAAAGQOIRUAAACZQ0gFAADAu2bPnt3zYx/72P4tbV+5cmW38ePHDyl1HdzdDwAAUAxX9Tm0uM+3dm4xnqaxsVFduxYe+d7//vdveP/737+hpe2DBg3a/Nvf/nZFMWrbEXpSAQAAOqlly5Z1Hzx48IhTTz118JAhQ0aMHz9+yPr166v222+/kRdffPF+dXV1w2+99dY97r333t0POeSQ2rq6uuEnn3zykLVr11ZJ0qxZs3qOGTOmdtiwYXUjR44cvmbNmqpf//rXvT/wgQ8cJEkPPvhgr9ra2rra2tq64cOH161Zs6Zq2bJl3YcOHTpCkjZs2OCzzz57UE1NTd3w4cPrpk+f3luSrr/++r0++MEPHnjMMccMPeCAAw7+1Kc+NaCt/1tBIdX2eNvLbC+3fUUz2wfaftj2E7YX2P7HthYCAACAtlu5cmWPyZMnv7pixYpFvXv33vrd7363WpL22muvxsWLFy+ZMGHC+m9961v9Z8+e/fTixYuXjB07dsPXv/71fhs3bvT5559/4H/913+tWrZs2eJZs2Yt69Wr19b8577uuuv2uf766/+ydOnSxX/605+Wbr/9mmuu6WtbTz/99OK77rprxaRJkwZt2LDBkrR48eKe999//4olS5YsmjZt2h7Lly/v1pb/q9WQaruLpBsknSypTtK5tuu2a/Yfku6OiDGSJkr6cVuKAAAAQPvss88+mz74wQ++LUkXXHDB648++mgvSbrwwgvXSNIjjzyy27PPPttj3LhxtbW1tXVTp07da9WqVd0XLFjQo2/fvpuPPfbYDZK05557bu3Wbdscefjhh7/1xS9+cf9vfOMbfV977bUu229/9NFHe11wwQWvS9KYMWM27rvvvpsWLlzYQ5KOPvrodXvttdeWnj17xkEHHbTx2Wef3aUt/1chFyiMk7Q8IlZIku2pkk6TtDivTUjaPXncR9Jf21IEAAAA2sd2s8u9e/feKkkRoaOPPnrd9OnTn8tv9/jjj+/a2nN/61vfevn0009f+8ADD/Q55phjah988MFnevbsubW1v5Ok7t27R9PjLl26xObNm72j9tsr5HT/fpLy51x9IVmX7ypJ/2T7BUkzJH2muSeyPcl2g+2G1atXt6VOAAAANOOll17q/tBDD+0mSXfeeeeeRx555Fv524877ri3Gxoaej311FO7SNK6deuqFixYsMuoUaM2vvrqq91mzZrVU5LWrFlTtXnz5m2ee9GiRbuMGzfub9/85jdfHjVq1NtPPfVUj/ztRx111Fs/+9nP9pSkBQsW7PLSSy91HzVq1MZi/F/FunHqXEm3R8QASf8o6ae23/PcEXFjRNRHRH11dXWRXhoAAKByDRo0aOMPf/jDvkOGDBnx5ptvdv3iF7+4TU/gvvvu2zhlypSVEydOHFJTU1NXX19fu3Dhwh49evSIO++889nPfvazA4cNG1Z33HHH1WzYsGGb/Hbttdf2HTp06Iiampq6bt26xdlnn702f/tll1326tatW11TU1N3zjnnHDhlypSVu+66a6gIHLHj57F9hKSrIuJDyfKXJSkivp3XZpGk8RHxfLK8QtLhEfFqS89bX18fDQ0NO/8fAACAohl0xYNtar+yx3ltaj9y8MCC29797cY2PffwpUva1L4tbM+NiPr8dfPnz185evTo10r2ogVYtmxZ9w9/+MNDn3nmmUVp1tFe8+fP33v06NGDmttWSE/qHElDbQ+23V25G6OmbddmlaQTJMn2cEk9JHE+HwAAAO3SakiNiEZJkyXNlLREubv4F9m+2vapSbMvSPqk7fmSfi7pY9FaFy
0AAAB2yrBhwzZ11l7U1hQ0/UBEzFDuhqj8dVfmPV4s6ajilgYAAIBKxbSoKIo2X8P0nVPa1H7kHSMLbrvwooVtem4AAJA9hFSUnSW1w9vUvpQX2gMAgPYhpCIdV/VpW/s23A0KAAA6v2KNkwoAAIAycP311+914YUXDpSkz3/+8/teeeWV/dKog55UAACAIhh5x8hDi/l8Cy9aOLct7bdu3aqIUJcuXYpZRmroSQUAAOikli1b1n3QoEEHn3HGGYNqampGXHbZZf0PPvjg4TU1NXWXXnrpvk3tfvSjH+1VU1NTN2zYsLrTTz99sCTdddddfUaNGlU7fPjwuiOPPLLm+eefz1TnZaaKAQAAQNusWrVql1tuueW5tWvXvvHLX/5yjwULFiyJCJ144okH/eY3v+lVXV3d+L3vfa//Y489trR///6Nr7zyShdJOumkk96aOHHi0qqqKn3/+9/f++qrr97npptueiHt/6cJIRUAAKAT69+//6YTTjjh7UmTJg2YPXv27nV1dXWStGHDhqqlS5f2mDdvXtWECRPW9O/fv1GS+vXrt0WSnnvuue6nn376gNWrV3fbtGlT1f777/9Omv/H9jjdDwAA0In17NlzqyRFhP7t3/7tpaVLly5eunTp4lWrVj116aWXvtbS302ePHngpz/96VeffvrpxT/60Y/+8s4772QqF2aqGAAAALTPySefvO6nP/3p3mvXrq2SpOeee67biy++2PVDH/rQuunTp+/x8ssvd5GkptP969ev7zJw4MDNknT77bfvlV7lzeN0PwAAQBk488wz1y1atKjHYYcdVivleljvvPPO5+rr6zd+4QtfeOmYY46praqqioMPPnjDr371q5Vf+cpX/nruuece2KdPn8ajjz56/apVq3ZJ+3/I54hI5YXr6+ujoaEhlddG8bV5WtQe57Wp/cg2DOZ/97cb2/TczDgFAH/H8bx5tudGRH3+uvnz568cPXp0i6fT0br58+fvPXr06EHNbeN0PwAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAB0Ut/4xjf6DhkyZMSHPvShAw855JDa7t27j73yyiv7pV1XMTCYPwAAQBEsqR1+aDGfb/jSJXNba3PLLbdUP/TQQ0/36NEjli9f3v2ee+7Zo5g1pImeVAAAgE7ovPPOG/jCCy/scvLJJw+9+eab9zz22GM3dOvWLZ1ZmkqAnlQAQGYx+xHQsrvuumvVrFmz+syaNevp/v37t+0N2gnQkwoAAIDMIaQCAAAgcwipAAAAyByuSQUAAOjkVq1a1fWwww6re/vtt7vYjilTpvRbsmTJU3vuuefWtGtrL0IqAABAERQyZFSxvfjiiwubHr/yyisLOvr1S4nT/QAAAMgcQioAAAAypyxP97d5XL3vnNKm9iPvGFlw24UXLWy9EQAAALZRliG1za7q07b2bRj8GQAAlK2tW7dudVVVVdnM8tSRtm7dakkt3tjF6X4AAID2eWr16tV9krCFNti6datXr17dR9JTLbWhJxXopLI0XSSXtQCoRI2Njf/y8ssv3/zyyy8fLDr+2mqrpKcaGxv/paUGhFQAO21J7fA2tWdOcwDl4NBDD31V0qlp11GuCkr9tsfbXmZ7ue0rmtn+A9tPJj9P236z+KUCAACgUrTak2q7i6QbJJ0k6QVJc2xPi4jFTW0i4tK89p+RNKYEtQIAAKBCFNKTOk7S8ohYERGbJE2VdNoO2p8r6efFKA4AAACVqZCQup+k5/OWX0jWvYftAyQNlvTHnS8NAAAAlarYd6JNlHRPRGxpbqPtSbYbbDesXr26yC8NAACAclHI3f0vSto/b3lAsq45EyVd0tITRcSNkm6UpPr6+ooY+Ja7ngEAANqukJ7UOZKG2h5su7tyQXTa9o1s10raQ9JjxS0RAAAAlabVkBoRjZImS5opaYmkuyNike2rbeePDTZR0tSIqIgeUgAAAJROQYP5R8
QMSTO2W3fldstXFa8sAAAAVDKm8AIAAEDmEFIBAACQOYRUAAAAZA4hFQAAAJlDSAUAAEDmEFIBAACQOYRUAAAAZA4hFQAAAJlDSAUAAEDmEFIBAACQOYRUAAAAZA4hFQAAAJlDSAUAAEDmEFIBAACQOYRUAAAAZA4hFQAAAJnTNe0CAKCzGHTFg21qv/I7p7Sp/cg7RhbcduFFC9v03ADQ2dCTCgAAgMwhpAIAACBzON0PAKVyVZ+2tR88sDR1AEAnRE8qAAAAMoeQCgAAgMzhdD8AdEJLaoe3qf3wpUtKVAkAlAY9qQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMKCqm2x9teZnu57StaaPNR24ttL7J9V3HLBAAAQCXp2loD210k3SDpJEkvSJpje1pELM5rM1TSlyUdFRFrbPctVcEAAAAof4X0pI6TtDwiVkTEJklTJZ22XZtPSrohItZIUkS8WtwyAQAAUEkKCan7SXo+b/mFZF2+Gkk1tv/P9p9sjy9WgQAAAKg8rZ7ub8PzDJV0nKQBkmbbHhkRb+Y3sj1J0iRJGjhwYJFeGgAAAOWmkJ7UFyXtn7c8IFmX7wVJ0yJic0Q8J+lp5ULrNiLixoioj4j66urq9tYMAACAMldISJ0jaajtwba7S5ooadp2be5XrhdVtvdW7vT/iiLWCQAAgArSakiNiEZJkyXNlLRE0t0Rscj21bZPTZrNlPS67cWSHpb0pYh4vVRFAwAAoLwVdE1qRMyQNGO7dVfmPQ5Jn09+AAAAgJ3CjFMAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwpKKTaHm97me3ltq9oZvvHbK+2/WTy8y/FLxUAAACVomtrDWx3kXSDpJMkvSBpju1pEbF4u6a/iIjJJagRAAAAFaaQntRxkpZHxIqI2CRpqqTTSlsWAAAAKlkhIXU/Sc/nLb+QrNveWbYX2L7H9v5FqQ4AAAAVqVg3Tk2XNCgiRkn6vaQ7mmtke5LtBtsNq1evLtJLAwAAoNwUElJflJTfMzogWfeuiHg9It5JFm+WdGhzTxQRN0ZEfUTUV1dXt6deAAAAVIBCQuocSUNtD7bdXdJESdPyG9jun7d4qqQlxSsRAAAAlabVu/sjotH2ZEkzJXWRdGtELLJ9taSGiJgm6bO2T5XUKOkNSR8rYc0AAAAoc62GVEmKiBmSZmy37sq8x1+W9OXilgYAAIBKxYxTAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMIaQCAAAgcwipAAAAyBxCKgAAADKHkAoAAIDMKSik2h5ve5nt5bav2EG7s2yH7frilQgAAIBK02pItd1F0g2STpZUJ+lc23XNtOst6XOS/lzsIgEAAFBZCulJHSdpeUSsiIhNkqZKOq2Zdl+XdI2kjU
WsDwAAABWokJC6n6Tn85ZfSNa9y/ZYSftHxIM7eiLbk2w32G5YvXp1m4sFAABAZdjpG6dsV0n6vqQvtNY2Im6MiPqIqK+urt7ZlwYAAECZKiSkvihp/7zlAcm6Jr0lHSzpEdsrJR0uaRo3TwEAAKC9CgmpcyQNtT3YdndJEyVNa9oYEWsjYu+IGBQRgyT9SdKpEdFQkooBAABQ9loNqRHRKGmypJmSlki6OyIW2b7a9qmlLhAAAACVp2shjSJihqQZ2627soW2x+18WQAAAKhkzDgFAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMIqQAAAMgcQioAAAAyh5AKAACAzCGkAgAAIHMKCqm2x9teZnu57Sua2f4p2wttP2n7f23XFb9UAAAAVIpWQ6rtLpJukHSypDpJ5zYTQu+KiJERcYikayV9v+iVAgAAoGIU0pM6TtLyiFgREZskTZV0Wn6DiFiXt7ibpCheiQAAAKg0XQtos5+k5/OWX5D0D9s3sn2JpM9L6i7p+KJUBwAAgIpUtBunIuKGiDhQ0uWS/qO5NrYn2W6w3bB69epivTQAAADKTCEh9UVJ++ctD0jWtWSqpNOb2xARN0ZEfUTUV1dXF14lAAAAKkohIXWOpKG2B9vuLmmipGn5DWwPzVs8RdIzxSsRAAAAlabVa1IjotH2ZEkzJXWRdGtELLJ9taSGiJgmabLtEyVtlrRG0kWlLBoAAADlrZAbpxQRMyTN2G7dlXmPP1fkugAAAFDBmHEKAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5BYVU2+NtL7O93PYVzWz/vO3FthfY/oPtA4pfKgAAACpFqyHVdhdJN0g6WVKdpHNt123X7AlJ9RExStI9kq4tdqEAAACoHIX0pI6TtDwiVkTEJklTJZ2W3yAiHo6IDcninyQNKG6ZAAAAqCSFhNT9JD2ft/xCsq4ln5D0m50pCgAAAJWtazGfzPY/SaqXdGwL2ydJmiRJAwcOLOZLAwAAoIwU0pP6oqT985YHJOu2YftESV+RdGpEvNPcE0XEjRFRHxH11dXV7akXAAAAFaCQkDpH0lDbg213lzRR0rT8BrbHSJqiXEB9tfhlAgAAoJK0GlIjolHSZEkzJS2RdHdELLJ9te1Tk2bfldRL0i9tP2l7WgtPBwAAALSqoGtSI2KGpBnbrbsy7/GJRa4LAAAAFYwZpwAAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmUNIBQ
AAQOYQUgEAAJA5hFQAAABkDiEVAAAAmVNQSLU93vYy28ttX9HM9vfbnme70fbZxS8TAAAAlaTVkGq7i6QbJJ0sqU7Subbrtmu2StLHJN1V7AIBAABQeboW0GacpOURsUKSbE+VdJqkxU0NImJlsm1rCWoEAABAhSnkdP9+kp7PW34hWQcAAACURIfeOGV7ku0G2w2rV6/uyJcGAABAJ1JISH1R0v55ywOSdW0WETdGRH1E1FdXV7fnKQAAAFABCgmpcyQNtT3YdndJEyVNK21ZAAAAqGSthtSIaJQ0WdJMSUsk3R0Ri2xfbftUSbJ9mO0XJH1E0hTbi0pZNAAAAMpbIXf3KyJmSJqx3bor8x7PUe4yAAAAAGCnMeMUAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCKkAAADIHEIqAAAAMoeQCgAAgMwhpAIAACBzCgqptsfbXmZ7ue0rmtm+i+1fJNv/bHtQsQsFAABA5Wg1pNruIukGSSdLqpN0ru267Zp9QtKaiDhI0g8kXVPsQgEAAFA5CulJHSdpeUSsiIhNkqZKOm27NqdJuiN5fI+kE2y7eGUCAACgknQtoM1+kp7PW35B0j+01CYiGm2vlbSXpNfyG9meJGlSsviW7WXtKbrY2p6mn9pb2/1vLdm+y7n1Yioj27PPOx77vOOxzzse+7zjVdA+P6CUT473KiSkFk1E3Cjpxo58zVKw3RAR9WnXUUnY5x2Pfd7x2Ocdj33e8djnKFQhp/tflLR/3vKAZF2zbWx3ldRH0uvFKBAAAACVp5CQOkfSUNuDbXeXNFHStO3aTJN0UfL4bEl/jIgoXpkAAACoJK2e7k+uMZ0saaakLpJujYhFtq+W1BAR0yTdIumntpdLekO5IFvOOv0lC50Q+54LTY8AACAASURBVLzjsc87Hvu847HPOx77HAUxHZ4AAADIGmacAgAAQOYQUgEAAJA5hFQAAABkDiEVAAAAmdOhg/l3ZraPljQ0Im6zXS2pV0Q8l3Zd5cj253e0PSK+31G1VAr2ecezvVBSi3euRsSoDiynItg+c0fbI+LejqqlEnBcwc4ipBbA9lcl1UsaJuk2Sd0k/UzSUWnWVcZ6J7+HSTpMfx+Xd4Kkx1OpqPyxzzveh5PflyS/f5r8Pj+FWirFhOR3X0lHSvpjsvwBSY9KIqQWF8cV7BSGoCqA7ScljZE0LyLGJOsW0NNRWrZnSzolItYny70lPRgR70+3svLFPu94tp9oOq7krZsXEWPTqqnc2f6dpIsi4qVkub+k2yPiQ+lWVp44rqC9uCa1MJuSGbRCkmzvlnI9laKfpE15y5uSdSgd9nnHs+2j8haOFMfmUtu/KaAmXpE0MK1iKgDHFbQLp/sLc7ftKZLeZ/uTkv5Z0k0p11QJ/lvS47bvS5ZPl3R7euVUhOb2+R0p1lMJPiHpVtt9kuU3lTvGoHT+YHumpJ8ny+dIeijFesodxxW0C6f7C2T7JEkflGRJMyPi9ymXVBFsj5V0TLI4OyKeSLOeSsA+T0dTSI2ItWnXUglsnyGp6XTz7Ii4b0ftsXM4rqA9CKnoVGz3ioi30q6jnDGSRfpsfzwibku7jnJm+wDl3ucP2e4pqUvTNZMoPo4raA+ueyqA7TNtP2N7re11ttfbXpd2XRVqcdoFlLNkJIvLJX05WdU0kgU61tfSLqCcJZdt3SNpSrJqP0n3p1dReeO4gvbimtTCXCtpQkQsSbuQSrCDsfUsqVdH1lKBzlAyko
UkRcRfkztxUWS2F7S0SdxUUmqXSBon6c+SFBHP2O6bbklljeMK2oWQWphXCKgd6luSviupsZlt9P6X1qaICNuMZFF6/SR9SNKa7dZbuTE7UTrvRMQm25Ik2121g4kVsNM4rqBdCKmFabD9C+VOB73TtJLZSUpmnqT7I2Lu9hts/0sK9VQSRrLoOL9W7rq8J7ffYPuRji+nosyy/e+Sdk1uiv20pOkp11TOOK6gXbhxqgC2m7uBISKCYWJKwPYwSa9HxGvNbOsXEa+kUFbFyBvJQpJ+x0gWKDe2q5Qb+uvdEVsk3Rx8IJYMxxW0ByEVmWV7bETMS7uOSmN7H+Wu1wtJcyLi5ZRLKmu2r5c0NSI4xd+BbHeXVKvc+3xZRGxq5U+wEziuoD0IqTtg+7KIuNb2D9XM9UoR8dkUyqoYth+WtI9yd+H+IiKeSrmkspdcTnGlcnOaW9Kxkq6OiFtTLayM2b5IucHkh0m6T7nA2pBuVeXN9imSfiLpWeXe54Ml/WtE/CbVwsoUxxW0FyF1B2xPiIjpyYfIe0QEM2aUWPLt+6PKfYjvrlxY/Ua6VZUv28skHRkRryfLe0l6NCKGpVtZ+bO9p6SzJE2UNDAihqZcUtmyvVTShyNiebJ8oHJzydemW1l54riC9uLGqR2IiOnJb8JoSpJTQtcnvaqXKfdtnJBaOq9Lyh/QfH2yDqV3kHKnnw+QxGgipbW+KaAmVmjb9z2Ki+MK2oWQugO2p2sHw5JExKkdWE7FsT1cuR7Us5Q7oP1C0hdSLapM5Y1Nu1zSn20/oNx7/zRJLY3niSKwfa1y40g+q9x7/OsR8Wa6VZUn22cmDxtsz5B0t3Lv849ImpNaYWWK4wp2FiF1x76XdgEV7lZJUyV9KCL+mnYxZa5pYO1nk58mD6RQS6V5VtIRzY1mgaKbkPf4FeWujZSk1ZJ27fhyyh7HFewUrkktkO1dlbtObFnatQAoL7b3U+40/7sdBxExO72KACB99KQWwPYE5XpVu0sabPsQ5e5M5HR/Cdk+StJV+vuHt5Ubn3ZImnWVM9v1kr6i9wamUakVVeZsf0e5m6UWS9qSrA5JhNQSsT1Y0mckDdK273OO6SXAcQXtRU9qAWzPlXS8pEciYkyybmFEjEy3svKW3IF7qaS5+vuHt5ruEEXxJXfhfknSQklbm9ZHxF9SK6rMJft8VES802pjFIXt+ZJu0Xvf57NSK6qMcVxBe9GTWpjNEbG2aZ7nBOm+9NYybmGHWx0R09IuosKskNRNeVMuo+Q2RsT1aRdRQTiuoF0IqYVZZPs8SV1sD5X0WUnMDlN6D9v+rqR7lfcBzixUJfVV2zdL+oO23ef3pldS2dsg6Unb2+9zJgspnf9n+6uSfieOLR2B4wrahZBamM8odz3NO5J+rtw8z19PtaLK8A/J7/q8daHcpRcojY8rN1ZnN/39tFwo90UBpTEt+UHHGSnpAuWOJfnvc44tpcFxBe3CNaltZLuLpN0iYl3atQDFZnsZs8B0vGQe+ZpkcVlEbE6znnJne7mkuojYlHYtlYDjCtqrKu0COgPbd9ne3fZuyl34vdj2l9Kuq9zZ7mP7+7Ybkp/rbPdJu64y96jturSLqCS2j5P0jKQbJP1Y0tO2359qUeXvKUnvS7uICsJxBe1CT2oBbD8ZEYfYPl/SWElXSJrL8BmlZftXyn2YNE1Le4Gk0RFxZst/hZ1he4mkAyU9p9zlLU3DfvFeL5Fk9JDzmsZgtl0j6ecRcWi6lZUv249IGqXcLFP510gyBFUJcFxBe3FNamG62e4m6XRJP4qIzbZJ96V3YESclbf8NdtPplZNZRifdgEVqFv+JCER8XRyvEHpfDXtAioMxxW0C6f7CzNF0kpJu0mabfsASVyTWnp/s31000IyuP/fUqyn7CXjFu4v6fjk8QZxnCi1Bts32z4u+blJUkPaRZWzZDzUlcp9QZilXI8qd/aXCMcVtBen+9vJdteIaEy7jnKWzOx1h6
Sm61DXSPpYRMxPr6rylgzLUy9pWETU2N5X0i8j4qiUSytbtneRdImkpi9k/yPpxwzuXzq2PylpkqQ9I+LAZGjBn0TECSmXVpY4rqC9CKkFsn2KpBGSejSti4ir06uoctjeXZIYUaH0ksspxkialze72gKuHSud5IbMjRGxJVnuImmXiNiQbmXlK3mfj5P0Z2YRLD2OK2gvutsLYPsnks5RbrxUS/qIcnMQo4Rsf8v2+yJiXUSss72H7W+kXVeZ2xS5b64hvRugUFp/kLRr3vKukh5KqZZK8U7+8FO2u4pZBEuJ4wrahZBamCMj4kJJayLia5KO0N/HNETpnBwRbzYtRMQaSf+YYj2V4G7bUyS9Lzkl+pCkm1Kuqdz1iIi3mhaSxz1TrKcSzLL975J2tX2SpF9Kmp5yTeWM4wrahbv7C9N0s86G5Fqa1yX1T7GeStHF9i5N1+bZ3lXSLinXVNYi4nvJh/Y6ScMkXRkRv0+5rHL3tu2xTVNy2j5U3CBYaldI+oRy417/q6QZkm5OtaIyxnEF7cU1qQWw/Z+SfqjclHk3JKtvjoj/TK+q8mf7ckkTJN2WrPq4pGkRcW16VVU2249FxBFp11FObB8maaqkvyp3OdE+ks6JiLmpFlbBbP9qu+HvUEIcV9ASQmoBkh68iyUdo9w1Nf8j6f+LiI2pFlYBbI+XdGKy+PuImJlmPZXO9hNNNz6geJJxUZumjdxmWlTbJ9Hr1LF4n3cs9jdaQkgtgO27Ja2X9LNk1XmS+kTER9OrCnz77ni250XE2LTrqCTs847HPu9Y7G+0hGtSC3NwROTPO/yw7cWpVYMmPVpvAnR6TrsAAEgDd/cXZp7tw5sWbP+DmBEmCzgN0PEITB2P93nH433esdjfaBY9qTtge6FyHxDdJD1qe1WyfICkpWnWBqTkgrQLADrA5WkXUGE4rqBZhNQd+3DaBWCH+PZdZLbPlHSNpL7K7V9LiohomvXrqRTLq1Qr0y6g3Ng+StJVynU4dNXf3+dDlHvwu/SqKz8cV9Be3DiFTsv2wRzcisv2ckkTImJJ2rWUu+SDu0URcW9H1VJpbC+VdKmkuZK2NK2PiNdTK6qMcVxBe9GTisyxvV47uA6Pb98l9QofJB1mQvK7r6QjJf0xWf6ApEclEVJLZ21E/CbtIioIxxW0CyEVmRMRvSXJ9tclvSTpp8qdHjpfzPRVag22fyHpfknvNK2kV6/4IuLjkmT7d5LqIuKlZLm/pNtTLK0SPGz7u8p9Ech/n89Lr6SyxnEF7cLpfmSW7fkRMbq1dSge27c1szoi4p87vJgKYXtJRAzPW66StCh/HYrL9sPNrI6IOL7Di6kAHFfQXvSkIsvetn2+clNGhqRzJb2dbknlral3Dx3qD7ZnSvp5snyOpIdSrKfsRcQH0q6hknBcQXsxTiqy7DxJH5X0SvLzkWQdSsR2je0/2H4qWR5l+z/SrqucRcRkST+RNDr5uTEiPpNuVeXNdj/bt9j+TbJcZ/sTaddVrjiuoL043Q/gXbZnSfqSpClNc2nbfioiDk63svJm+wBJQyPiIds9JXWJiPVp11WuknB6m6SvRMRo210lPRERI1MurSxxXEF70ZOKzOLbdyp6RsTj261rTKWSCmH7k5LukTQlWbWfcjeYoHT2joi7JW2VpIhoVN5QVCg6jitoF0IqsuwmSV+WtFmSImKBpImpVlT+XrN9oJIhwGyfrdwICyidSyQdJWmdJEXEM8oNS4XSedv2Xvr7+/xwSWvTLamscVxBu3DjFLKsZ0Q8bm8zsRTfvkvrEkk3Sqq1/aKk55Qb+gul805EbGp6nyennrkOq7Q+L2mapANt/5+kaklnp1tSWeO4gnYhpCLL+Pbd8faIiBNt7yapKiLW2/6wpL+kXVgZm2X73yXtavskSZ+WND3lmsrdGknHShqm3BjMyyQdkmpF5Y3jCtqFG6eQWbaHKPft+0jlPlSek3R+RHBgKxHb8yRd2DSbl+2Jki
6NiH9It7LylYyL+glJH1QuMM2UdHNwcC4Z23MlnRoRLybL75d0AzdOlQbHFbQXIRWZl//tO+1ayl3yxeAe5Yb6OkbShZI+HBFcr1dCtrtLqlXurMGyiNiUckllzfZhkn6s3NS0YyV9W7n3+fOpFlamOK6gvQipyKzkxoavSjpauQ/v/5V0dUS8nmphZc52jXJ3l6+SdEZE/C3lksqa7VOUGyf1WeV6UgdL+lfmli8t20coN6LCRkmnRMTqlEsqaxxX0B6EVGSW7d9Lmi3pZ8mq8yUdFxEnpldVebK9UNverNNXubud35GkiBiVRl2VwPZS5XqVlifLB0p6MCJq062s/Nierm3f53XKXee+RpIi4tQ06ipXHFewswipyKzmBnu2vZDrxoovGUy+RVwHXDq250TEYXnLlvR4/joUh+1jd7Q9ImZ1VC2VgOMKdhZ39yPLfpdcYH93sny2cjeVoMjyPyxsj1buujFJ+p+ImJ9OVeXN9pnJwwbbM5R7n4dy0//OSa2wMpYfQm33k9T0ReDxiHg1narKF8cV7Cx6UpE5ttcr92FtSbspmRVGuckn3oqI3dOqrdzZ/pykT0q6N1l1hnJzyf8wvarKk+3bdrQ9Ij7eUbVUGtsflfRdSY8od5w5RtKXIuKeNOsqVxxX0F6EVADvsr1A0hER8XayvJukx7h2DOXE9nxJJzX1ntqulvRQRIxOt7LyxHEF7cXpfmSa7VGSBinvvRoR97b4B9hZ1rZzmG9J1qFEbA+W9Bm9933OTTylU7Xd6f3XxTThpcRxBe1CSEVm2b5V0ihJi/T3U/6hv58yQvHdJunPtu9Llk+XdGuK9VSC+yXdotwsU1tbaYvi+K3tmZJ+niyfI4khv0qH4wrahdP9yCzbiyOiLu06Ko3tscqNTSvlbnB4Is16yp3tPzPzTsdLblzLf5/ft6P22DkcV9AehFRklu1bJF0XEYvTrqVS2P5pRFzQ2joUj+3zJA2V9Dsl40dKUkTMS62oMmf7moi4vLV1KA6OK2gvTvcjy/5b0mO2X1buw9uSgovtS2pE/oLtLpIOTamWSjFS0gWSjte2l7Ucn1pF5e8kSdsH0pObWYfi4LiCdiGkIstuUe7De6G4Vq+kbH9Z0r9L2tX2uqbVkjZJujG1wirDRyQNiYhNaRdS7mxfLOnTkoYkd5w36S3p/9KpqnxxXMHO4nQ/Msv2YxFxRNp1VBLb346IL+9g+4iIWNSRNZU72/dLmsRg8qVnu4+kPSR9W9IVeZvWR8Qbee32iIg1HV1fueK4gvYipCKzbP9Y0vuUu+s5/1o97u5Pie15ETE27TrKie1HlBvFYo62fZ8zBFVKeJ93LPY3WsLpfmTZrsp9aH8wbx1DUKWLsQ2L76tpF4D34H3esdjfaBYhFZnFtJCZxKmXIouIWbYPkDQ0Ih6y3VNSl7TrqnC8zzsW+xvNYoYNZJbtGtt/sP1UsjzK9n+kXRdQTLY/KekeSVOSVfspN8A/AFQ0Qiqy7CZJX5a0WZIiYoGkialWBO5AL75LJB0laZ0kRcQzkvqmWhE4/Vwkztm/lWYcV9AsTvcjy3pGxOP2Np8XjWkVU86S2WBa1DSwfEQc3jEVVZR3ImJT0/vcdldx+rNkkjE6F0VE7Q6andBR9ZS7iAjbM5QbD7ilNhxX0CxCKrLsNdsHKvnAtn22pJfSLalsXZf87iGpXtJ85XqTRklqkMRQYKUzy3bTWJInKTeO5/SUaypbEbHF9jLbAyNiVQtt3mhuPdptnu3DImJO2oWgc2EIKmSW7SHKDfh8pKQ1kp6TdH5E/CXVwsqY7XslfTUiFibLB0u6KiLOTrey8mW7StInlBvFwpJmSro5ODiXjO3ZksZIelzS203rGfarNGwvlXSQpL8ot7+ZPRAFIaQi82zvJqkqItZvt/6iiLgjpbLKku1FEbH9FIbvWYeOY/tXEXFW2nWUE9vHNrc+ImZ1dC2VIBm94j3ocEBrCKnotBgAuvhs/1
y5no6fJavOl9QrIs5Nr6rKZvuJiBiTdh3AzrLdV7lLiiRJLV1uATTh7n50ZtyBW3wfl7RI0ueSn8XJOqSHnoQis3247Tm237K9yfaWvLnlUWS2T7X9jHKXbM2StFLSb1ItCp0CN06hM+PDu8giYqPtn0iaERHL0q4HKJEfKTec3S+Vu1HwQkk1qVZU3r4u6XBJD0XEGNsfkPRPKdeEToCeVHRm9KQWme1TJT0p6bfJ8iG2p6VbVcXjfV4CEbFcUpeI2BIRt0kan3ZNZWxzRLwuqcp2VUQ8rNyXA2CH6ElFZ/Z/aRdQhr4qaZykRyQpIp60PTjVispYMmbnf0fE+TtodnlH1VNBNtjuLulJ29cqN7QdnTal86btXpJmS7rT9qvKG1UBaAk3TiGzbO8i6SxJg5T3hSoirk6rpnJn+08RcXj+zTq2FzBUTOnY/l9Jx0cEs+50kORu81ckdZd0qaQ+kn6c9K6iyJIRWjYqd1bgfOX2951J7yrQInpSkWUPSForaa6kd1KupVIssn2epC62h0r6rKRHU66p3K2Q9H/JZRX5Y3Z+P72Sylve0EcbJX0tzVoqQUTk95oybCAKRkhFlg2ICK4T61ifkfQV5b4U/Fy5geW/nmpF5e/Z5KdKUu+Ua6kIto+SdJWkA7TtWZohadVUzmyfKekaSX2V601tGsx/91QLQ+Zxuh+ZZftGST9smv0IAIohmQHpUuXO0mxpWs/p59KwvVzShIhYknYt6FzoSUWWHS3pY7afU65nj6n0Ssx2jaQv6r3XAR+fVk3lzna1pMskjdC2A52zz0tnbUQwTmfHeYWAivagJxWZxVR6Hc/2fEk/0Xt7mOamVlSZs/07Sb9Q7svBpyRdJGl1RHBXf5HZbpqh7qOSuki6V3nXu0fEvDTqKlfJaX5JOlbSPpLu17b7+9406kLnQUhFptk+WtLQiLgt6XHqFRHPpV1XubI9NyIOTbuOStK0z/NHUbA9JyIOS7u2cmP74R1sDnqvi8v2bTvYHBHxzx1WDDolTvcjs2x/VbkBn4dJuk1SN+XmlD8qzbrK3HTbn5Z0n7bt8XgjvZLK3ubk90u2T5H0V0l7plhP2YqID6RdQyWJCKZUxk5h8GJk2RmSTlUyLE9E/FXc/VxqF0n6knLDTs1NfhpSraj8fcN2H0lfUO6U/83K3dSDErH9Ldvvy1vew/Y30qypnNm+o5n9fWuaNaFz4HQ/Msv24xExzva8iBibDAj9GDdOAdgZ+ZNV5K2bFxFjW/obtF8L+/s964DtcbofWXa37SmS3mf7k5L+WbleJhSZ7eMj4o95NzpsgxscSie51vqTeu+IClyvVzpdbO8SEe9Iku1dJe2Sck3lrMr2HhGxRpJs7ynyBwrAmwRZdp2kEyWtU+661CuVm/sZxXespD9KmtDMtlDuLmiUxgOS/kfSQ8obUQEldaekP+Td2PNxMRNSKV0n6THbv0yWPyLpmynWg06C0/3ILNu35vcm2e4l6YGIOCHFsoCisv1kRBySdh2VxvZ45b4ES9LvI2JmmvWUO9t1kppGT/hjRCzO2/ZuLyuQj5CKzLL9dUl7RcSnbe8h6UFJN0XEjoY1wU5K7jDffmD5q9OrqLwlN+w8GhEz0q4FObYfi4gj0q6jUnA9MFpCSEWm2b5W0u6SDpX0nYj4VcollTXbP5HUU9IHlLv+92xJj0fEJ1ItrAzZXq/cpRSWtJtyQ35tFvOap46bejoW+xstIaQic7a7eceS/lPS45J+K3ETTyk1DSif97uXpN9ExDFp1wZ0FHr2Ohb7Gy3hxilk0fY37zyh3ED+E8RNPKW2Mfm9wfa+kl6X1D/Fesqe7TOUu0ZvbbL8PknHRcT96VYGAOkipCJzmKUkVdOTkPRdSfOU+1JwU7ollb2vRsR9TQsR8WYy2xohNT1Ou4AKw/5Gs5hxCplle4Dt+2y/mvz8yvaAtOsqV7arJP0hIt5Mrv09QFJtRFyZcmnlrrnjMB0I6bog7Q
LKie0Dbe+SPD7O9mfzZ6CSxIgtaBYhFVl2m6RpkvZNfqYn61ACEbFV0g15y+80nYJGSTXY/n7yQX6g7e8rNx0tisz2etvrWvppahcRT6VZZxn6laQttg+SdKOk/SXd1bQxIt5IqzBkGyEVWVYdEbdFRGPyc7uk6rSLKnN/sH2WbU6/dZzPSNok6ReSpip3XfAlqVZUpiKidzJqwv+TdIWk/SQNkHS5pP9Ks7YytzUi/v/27jxWrrIO4/j3ubUELK0p0cRELAKyimU1bS0iYtwiimAAWaUQUSEs4oKgEWOiBVkMFoEQoLIJSIKKGDQRjSIgWAuIgjVAgcRIXIBQK4KFxz/OGTvU2xbCzLznnHk+CaHnvSV5Mkzm/s477+93VgH7Aotsf5acdY8XId390ViSbqbaOb26XjoIWJBh/sNTj0WaBqyiKpYyDqkwSYtsH1c6R5dIusf2jutbi8GQdAfVTcAXgA/YXi7p97Z3KBwtGi47qdFkRwIHAI8Bf6Ga2XlEyUBdV+80TdjewPaMvp2nKGd+6QAdtFLSIZKmSJqQdAiwsnSoDlsAzAO+WheomwNXFM4ULZCd1GgsSfNt37q+tRgcSTevuVM92VqMTmZIDp6kN1B95T+faoLFrcCJth8ul6rbJG0AbF1fLrP9n5J5oh3SQRpNtghY85fzZGvxMknakOpJU6+uH0HbO5M6g+rcXkRn1MXoPqVzjAtJewKXAQ9Tfba8XtJHbf+yZK5ovhSp0TiS5gFvBV4j6aS+H80AppRJ1XkfB06kmqKwtG/9KeC8IomiJ01sAyLpc7a/LmkR1Q7qC9g+vkCscXA28G7bywAkbU3Va7Br0VTReClSo4k2ADamen9O71t/iupcagyY7XOBcyUdZ3tR6TzxAueWDtAh99f/XlI0xfiZ2itQAWz/SdLUkoGiHXImNRpL0ma2H1nHz9P1PGCSpgGfAmbZPlrSVsA2tm8sHK2zJO1G1fW8GdWNWW+iwuyiwTpK0hTgDNufKZ1lXEi6FHgeuLJeOgSYYvvIcqmiDVKkRmuloWTwJF1LNUj+cNs7SHolcJvtnQpH6yxJy4DPAvdS/SIHYF03aPHySLrd9rzSOcZF/bSpY4Hd66VbgPNtP1MuVbRBvu6PiH5b2j5Q0kEAtv+Vwf5D9zfbN5QOMWbulnQDcB19o6dsX18uUnfZfkbSecDNVDdiy2w/WzhWtECK1Ijo96ykjaibSiRtCWS3Y7hOk3Qx1S/w/73WKZiGakPgH8BefWsG8poPgaT3AxcCD1IdZ9lc0sdt31Q2WTRditRos+zwDd5pwI+pRsRcRTVH8oiiibpvAbAtMJXVX/enYBoi2wtKZxgzZwPvsP0A/O/m90dAitRYpxSp0Wbpeh4gSRPATGA/YC7VTcAJtv9eNFj3vcX2NqVDjBNJW1B9fsyluiG4nWqY//KiwbprRa9ArT0ErCgVJtojjVPRWOl6Hj1JS2zvVjrHOJG0GDjT9n2ls4wLSb8GvkU1qxPgI8BxtueUS9Vdki6g+hz/LtVNwf7Ao8BPIUdbYu1SpEZjpet59CSdDvwduJYXNpQ8XixUx0m6H9gSWE51JjU3Y0Mm6Xdrvr6S7rG9Y6lMXVbfiK2NM4oq1iZFajSWpF/Z3n39fzMGRdJyJn8SzxYF4owFSZtNtp6bscGTtEn9x5OBJ4BrqN7vBwIzbZ9SKts4k3SK7YWlc0TzpEiNxpL0TuAg0vU8MnVn/zFU8wxNNc/wQttPFw3WcZJ2BN5WX95i+56Sebqq7yZssqZL52asjMy8jrVJ41Q0WbqeR+8yqsfPfrO+PrheO6BYoo6TdALwMVa/r6+UdFEeTzt4tjcvnSEmlUktManspEZjSVqWrufRknSf7e3XtxaDI+l3wDzbK+vracDtOZM6PJJ+C1wCfMf2k6XzjLvspMbaTJQOELEOt0lKcTRaSyXN7V1ImgMsKZhnHAh4ru/6ObKzNGwHAq8Dlki6RtJ78mS1ovLax6
SykxqNla7n0atf822oxsMAzAKWAavIaz8Ukk4CPgp8r176EHCZ7W+USzUe6tnAewMXUN0cLAbOzTSL0ZJ0wh2/hgAABPRJREFUqu2vlc4RzZMiNRorXc+jt7bXvCev/XBI2oWqWQ2qxqm7SuYZB5JmA0cC7wN+AlxF9f/gMNs7lczWFZIWMcm0kB7bx48wTrRQGqeisWw/kq7n0UoROnqSrrB9GLB0krUYgvpM6pPAxcDJtnvTQ+6QNL9css7pHRWaD2xPNX8ZqmH+eXhFrFd2UqOxJul63hdI13N0yppNI5KmAPemWW146rPuO7P6aXYA2P5KsVAdVj/ha3fbq+rrqVSbDnPX/V/GuMtOajTZUcCcvq7nM6iesZ0iNVpP0inAqcBGkp7qLQPPAhcVCzYezqHaSV1K3wzmGJqZwAygd9Z343otYp1SpEaTpes5Oqt+ws5CSQvzpKOR29T2e0uHGCOnA3dJ+jnVZ/gewJeLJopWSJEaTbaY6oxYf9fzpQXzRAzDjZKm2V4p6VBgF6oO85wPHp7bJL3Z9r2lg4wD24sl3QTMqZdOtv1YyUzRDjmTGo2WrufounqY/47AbODbVM08B9h+e8lcXSTpXqpu81cAWwEPkfF2QyNpW9t/rD/H/4/tpZOtR/SkSI3GmqzDOV3P0TW9xilJXwL+bPuSPIFnODJibbTqx/seXX/Nvybb3mvkoaJV8nV/NNmb+i/qruddC2WJGJYVdRPVocAe9YD5qYUzdVKK0NGqC9QJ4Iu2by2dJ9onj0WNxpF0iqQVwGxJT9X/rAD+CvygcLyIQTuQ6ivno+pzepsCZ5aNFDEYtp8HziudI9opX/dHY6XrOSKi/SSdRTU+8Hqn6IiXIEVqNFb95Je70/UcXVZ/S9D7IN6A6qv+f9p+VblUEYNTv8enUY0RfJrVjWozigaLxsvX/dFkFwD/qh+N+mngQeDyspEiBsv2dNsz6l/YGwEfpnrvR3RC/R6fsD21fq9PT4EaL0Z2UqOx0vUc40rSXbZ3Lp0jYlAk7Uc1TtBU4wS/XzhStEC6+6PJ0vUcnVf/8u6ZAHYD/l0oTsTASTofeCNwdb30CUnvsn1swVjRAtlJjcaS9FrgYOA3tm+RNAvY03a+8o/OkLS473IV8DBwke2/lUkUMViS/ghs12uaqjcc/mB7u7LJoumykxqNVY/jOafv+lFyJjW6ZwI4wfaTAJJmAmcDRxZNFTE4DwCzgF7T6+vrtYh1SpEajZWu5xgTs3sFKoDtJyTlPGq0nqQfUn2GTwful3RnfT0HuLNktmiHFKnRWLan9/4sScA+wNxyiSKGYkLSTNtPAEjahHw2RzecVTpAtFvOpEarpOs5ukbS4cCpwHX10v7AV21fUS5VRER5KVKjsdbS9fx22/MKRYoYCknbA3vVlz+zfV/JPBGDIOlXtndf4+gWZJh/vEgpUqOx0vUcERExvnLuKZosXc8RES0maQrVuKltS2eJ9sljUaPJ/q/rGch51IiIlrD9HLCsnnMd8ZJkJzWaLF3PERHtNxP4Qz2CamVv0fYHy0WKNsgv/Giys4HbJb2g67lgnoiIeOk2BPbuuxZwRqEs0SIpUqOxbF8uaQmru573S9dzRETrvML2L/oXJG1UKky0R7r7IyIiYuAkfRI4BtgCeLDvR9OBW20fWiRYtEaK1IiIiBg4Sa+iOo+6EPh8349W2H68TKpokxSpEREREdE4GUEVEREREY2TIjUiIiIiGidFakREREQ0TorUiIiIiGicFKkRERER0Tj/Bef1WNN+5uZTAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 720x504 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PI36lZHpcE53"
      },
      "source": [
        "Since the [*PubMed 200k RCT:\n",
        "a Dataset for Sequential Sentence Classification in Medical Abstracts*](https://arxiv.org/pdf/1710.06071.pdf) paper compares their tested model's F1-scores on the test dataset, let's take at our model's F1-scores.\n",
        "\n",
        "> 🔑 **Note:** We could've also made these comparisons in TensorBoard using the [`TensorBoard`](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard) callback during training."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "PtDMFKsCcD1j",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 571
        },
        "outputId": "8a8cbc5f-adc6-4d22-dca8-c7bf1c05b2e3"
      },
      "source": [
        "# Sort model results by f1-score\n",
        "all_model_results.sort_values(\"f1\", ascending=False)[\"f1\"].plot(kind=\"bar\", figsize=(10, 7));"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlMAAAIqCAYAAAAAbM/hAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3debgkZX3+//fNAIoIinHMwi4BDFEUHFHEFTXBIBB3kLhHkijqV/M1oskXDCZxi+ZnDEZRQaNRxH2IGFRUNEqEYREEnDgCKmgUEYVoBMHP74+qI80wS0t1d51T/X5d17nmVHXJ3On01LlP1fM8lapCkiRJt80mfQeQJElayixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1MGmff3Fd73rXWunnXbq66+XJEka2znnnPODqlq+rtd6K1M77bQTq1at6uuvlyRJGluSb67vNW/zSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI62LTvANOw01Ef7zvCbXb5qw/sO4IkSfoVeGVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6mCsMpXkgCSrk6xJctQ6Xt8hyWeTnJfkgiR/MPmokiRJi89Gy1SSZcBxwKOBPYDDkuyx1mF/BZxcVXsBhwJvnnRQSZKkxWicK1P7AGuq6tKqugE4CThkrWMK2Lr9/k7AdyYXUZIkafEa53Ey2wLfHtm+Arj/Wse8AvhkkucDWwKPnEg6SZKkRW5SA9APA95ZVdsBfwC8O8mt/ttJjkiyKsmqq666akJ/tSRJUn/GKVNXAtuPbG/X7hv1bOBkgKo6E7g9cNe1/0NVdXxVraiqFcuXL79tiSVJkhaRccrU2cCuSXZOsjnNAPOVax3zLeARAEl+h6ZMeelJkiQN3kbLVFXdCBwJnAZcQjNr76IkxyY5uD3sz4HnJPkK8D7gGVVV0wotSZK0WIwzAJ2qOhU4da19R498fzGw32SjSZIkLX6ugC5JktSBZUqSJKmDsW7zSRuz01Ef7zvCbXb5qw/sO4IkaQnzypQkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1s2ncASbfNTkd9vO8It9nlrz6w7wiSNDFemZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA5/NJ0lj8nmIktZlrCtTSQ5IsjrJmiRHreP1f0hyfvv1X0l+NPmokiRJi89Gr0wlWQYcBzwKuAI4O8nKqrp44ZiqetHI8c8H9ppCVkmSpEVnnCtT+wBrqurSqroBOAk4ZAPHHwa8bxLhJEmSFrtxytS2wLdHtq9o991Kkh2BnYHPdI8mSZK0+E16APqhwAer6qZ1vZjkCOAIgB122GHCf7UkaWgc9K+lYJwrU1cC249sb9fuW5dD2cAtvqo6vqpWVNWK5cuXj59SkiRpkRqnTJ0N7Jpk5ySb0xSmlWsflOQewDbAmZONKEmStHhttExV1Y3AkcBpwCXAyVV1UZJjkxw8cuihwElVVdOJKkmStPiMNWaqqk4FTl1r39Frbb9icrEkSZKWBh8nI0mS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwD
IlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOti07wCSJGnx2Omoj/cd4Ta7/NUH9vL3emVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR2MVaaSHJBkdZI1SY5azzFPSnJxkouSvHeyMSVJkhanTTd2QJJlwHHAo4ArgLOTrKyqi0eO2RV4GbBfVV2T5G7TCixJkrSYjHNlah9gTVVdWlU3ACcBh6x1zHOA46rqGoCq+v5kY0qSJC1O45SpbYFvj2xf0e4btRuwW5IvJvnPJAdMKqAkSdJittHbfL/Cf2dX4GHAdsDnk9yrqn40elCSI4AjAHbYYYcJ/dWSJEn9GefK1JXA9iPb27X7Rl0BrKyqn1fVZcB/0ZSrW6iq46tqRVWtWL58+W3NLEmStGiMU6bOBnZNsnOSzYFDgZVrHfNRmqtSJLkrzW2/SyeYU5IkaVHaaJmqqhuBI4HTgEuAk6vqoiTHJjm4Pew04OokFwOfBV5SVVdPK7QkSdJiMdaYqao6FTh1rX1Hj3xfwIvbL0mSpLnhCuiSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdTBWmUpyQJLVSdYkOWodrz8jyVVJzm+//njyUSVJkhafTTd2QJJlwHHAo4ArgLOTrKyqi9c69P1VdeQUMkqSJC1a41yZ2gdYU1WXVtUNwEnAIdONJUmStDSMU6a2Bb49sn1Fu29tj09yQZIPJtl+Xf+hJEckWZVk1VVXXXUb4kqSJC0ukxqAfgqwU1XtCXwKeNe6Dqqq46tqRVWtWL58+YT+akmSpP6MU6auBEavNG3X7vulqrq6qq5vN98O3Hcy8SRJkha3ccrU2cCuSXZOsjlwKLBy9IAkvzmyeTBwyeQiSpIkLV4bnc1XVTcmORI4DVgGnFBVFyU5FlhVVSuBFyQ5GLgR+CHwjClmliRJWjQ2WqYAqupU4NS19h098v3LgJdNNpokSdLi5wrokiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjoYq0wlOSDJ6iRrkhy1geMen6SSrJhcREmSpMVro2UqyTLgOODRwB7AYUn2WMdxWwEvBL486ZCSJEmL1ThXpvYB1lTVpVV1A3AScMg6jnsl8BrgZxPMJ0mStKiNU6a2Bb49sn1Fu++XkuwNbF9VH59gNkmSpEWv8wD0JJsAbwD+fIxjj0iyKsmqq666qutfLUmS1LtxytSVwPYj29u1+xZsBdwT+FySy4EHACvXNQi9qo6vqhVVtWL58uW3PbUkSdIiMU6ZOhvYNcnOSTYHDgVWLrxYVT+uqrtW1U5VtRPwn8DBVbVqKoklSZIWkY2Wqaq6ETgSOA24BDi5qi5KcmySg6cdUJIkaTHbdJyDqupU4NS19h29nmMf1j2WJEnS0uAK6JIkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJK
kDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1MFaZSnJAktVJ1iQ5ah2v/2mSC5Ocn+Q/kuwx+aiSJEmLz0bLVJJlwHHAo4E9gMPWUZbeW1X3qqr7AK8F3jDxpJIkSYvQOFem9gHWVNWlVXUDcBJwyOgBVXXtyOaWQE0uoiRJ0uK16RjHbAt8e2T7CuD+ax+U5HnAi4HNgf0nkk6SJGmRm9gA9Ko6rqp2AV4K/NW6jklyRJJVSVZdddVVk/qrJUmSejNOmboS2H5ke7t23/qcBPzhul6oquOrakVVrVi+fPn4KSVJkhapccrU2cCuSXZOsjlwKLBy9IAku45sHgh8fXIRJUmSFq+NjpmqqhuTHAmcBiwDTqiqi5IcC6yqqpXAkUkeCfwcuAZ4+jRDS5IkLRbjDECnqk4FTl1r39Ej379wwrkkSZKWBFdAlyRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKmDscpUkgOSrE6yJslR63j9xUkuTnJBktOT7Dj5qJIkSYvPRstUkmXAccCjgT2Aw5LssdZh5wErqmpP4IPAaycdVJIkaTEa58rUPsCaqrq0qm4ATgIOGT2gqj5bVT9tN/8T2G6yMSVJkhanccrUtsC3R7avaPetz7OBT3QJJUmStFRsOsn/WJI/AlYAD13P60cARwDssMMOk/yrJUmSejHOlakrge1Htrdr991CkkcCfwkcXFXXr+s/VFXHV9WKqlqxfPny25JXkiRpURmnTJ0N7Jpk5ySbA4cCK0cPSLIX8FaaIvX9yceUJElanDZapqrqRuBI4DTgEuDkqrooybFJDm4Pex1wR+ADSc5PsnI9/zlJkqRBGWvMVFWdCpy61r6jR75/5IRzSZIkLQmugC5JktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB2OVqSQHJFmdZE2So9bx+kOSnJvkxiRPmHxMSZKkxWmjZSrJMuA44NHAHsBhSfZY67BvAc8A3jvpgJIkSYvZpmMcsw+wpqouBUhyEnAIcPHCAVV1efvaL6aQUZIkadEa5zbftsC3R7avaPdJkiTNvZkOQE9yRJJVSVZdddVVs/yrJUmSpmKcMnUlsP3I9nbtvl9ZVR1fVSuqasXy5ctvy39CkiRpURmnTJ0N7Jpk5ySbA4cCK6cbS5IkaWnYaJmqqhuBI4HTgEuAk6vqoiTHJjkYIMn9klwBPBF4a5KLphlakiRpsRhnNh9VdSpw6lr7jh75/mya23+SJElzxRXQJUmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKU
mSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSerAMiVJktSBZUqSJKkDy5QkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR1YJmSJEnqwDIlSZLUgWVKkiSpA8uUJElSB5YpSZKkDixTkiRJHVimJEmSOrBMSZIkdWCZkiRJ6sAyJUmS1IFlSpIkqQPLlCRJUgeWKUmSpA4sU5IkSR1YpiRJkjqwTEmSJHVgmZIkSepgrDKV5IAkq5OsSXLUOl6/XZL3t69/OclOkw4qSZK0GG20TCVZBhwHPBrYAzgsyR5rHfZs4Jqq+m3gH4DXTDqoJEnSYjTOlal9gDVVdWlV3QCcBByy1jGHAO9qv/8g8IgkmVxMSZKkxSlVteEDkicAB1TVH7fbTwXuX1VHjhzz1faYK9rtb7TH/GCt/9YRwBHt5u7A6kn9HzJjdwV+sNGjNEm+57Pnez57vuez53s+e0v1Pd+xqpav64VNZ5miqo4Hjp/l3zkNSVZV1Yq+c8wT3/PZ8z2fPd/z2fM9n70hvufj3Oa7Eth+ZHu7dt86j0myKXAn4OpJBJQkSVrMxilTZwO7Jtk5yebAocDKtY5ZCTy9/f4JwGdqY/cPJUmSBmCjt/mq6sYkRwKnAcuAE6rqoiTHAquqaiXwDuDdSdYAP6QpXEO25G9VLkG+57Pnez57vuez53s+e4N7zzc6AF2SJEnr5wrokiRJHVimJEmSOrBMSZIkdWCZkiRJ6mCmi3YuNUlevKHXq+oNs8oyT3zfZy/J4zb0elV9eFZZ5oWf8/4keRCwa1WdmGQ5cMequqzvXEOU5EJgvTPdqmrPGcaZGsvUhm3V/rk7cD9uXl/rIOCsXhLNB9/32Tuo/fNuwAOBz7TbDwe+BFimJs/PeQ+SHAOsoHnfTwQ2A94D7NdnrgF7TPvn89o/393+eXgPWabGpRHGkOTzwIFVdV27vRXw8ap6SL/Jhs33ffaSfBJ4elV9t93+TeCdVfX7/SYbLj/ns5XkfGAv4Nyq2qvdd8FQrpAsVknOW3i/R/adW1V795VpkhwzNZ5fB24Y2b6h3afp8n2fve0XilTre8AOfYWZE37OZ+uG9gkdBZBky57zzIsk2W9k44EMqIN4m288/wKcleQj7fYfAu/qMc+8WNf7/s7+4syF05OcBryv3X4y8Oke88wDzy+zdXKStwJ3TvIc4FnA23rONA+eDZyQ5E7t9o9o3vtB8DbfmJLsDTy43fx8VZ3XZ5554fs+e0keCyzcYvp8VX1kQ8erOz/ns5XkUcDvAQFOq6pP9RxpbiyUqar6cd9ZJskyNSZnfywOSe5YVf/Td44hS7IjzWf900nuACxbGM+j6fD8onmU5JlVdWLfOSZhMPcrp6md/fFS4GXtroXZH5q9i/sOMGTtbY8PAm9td20LfLS/RMPn+WW2kjwuydeT/DjJtUmuS3Jt37nm1F/3HWBSHDM1nsfSzv4AqKrvtDNuNAUbWH8nwB1nmWUOPQ/YB/gyQFV9Pcnd+o00eJ5fZuu1wEFVdUnfQeZBkgvW9xIDmmhhmRrPDVVVSZz9MRt/B7wOuHEdr3k1dbqur6obkgCQZFM2sOCeJsLzy2x9zyI1U78O/D5wzVr7Q7OG3SBYpsbj7I/ZOhf4aFWds/YLSf64hzzz5IwkLwe2aAfpPhc4pedMQ+f5ZbZWJXk/ze3r6xd2usr/1PwbzRjA89d+IcnnZh9nOhyAPqaR2R8An3T2x/Qk2R24uqp+sI7Xfr2qvtdDrLmQZBOaKcy/nOkEvL08UUyV55fZSbKuAc9VVYOZpq/Zs0yNKclv0IwlKeDsqvrvniMNXpK9q+rcvnPMmySbA/eg+ayvrqobNvI/UUeeXzR0Sf4ROKmqBnNrb5RlagztraWjaZ5XFuChwLFVdUKvwQYuyWeB36CZXfb+qvpqz5EGL8mBwFuAb9B81ncG/qSqPtFrsAHz/DIbSf6iql6b5E2sYxxgVb2gh1hzI8nTaRYB3h34CE
2xWtVvqsmxTI0hyWrggVV1dbv9a8CXqmr3fpMNX/sb+5No/hFuTVOq/qbfVMOV5GvAY6pqTbu9C81z4u7Rb7Lh8vwyG0kOqqpT2h/qt1JVrjo/A0nuAjweOBTYoap27TnSRDgAfTxXA6OLFl7X7tOUtbc7/rG9SvUXNL/BW6am57qFItW6lFt+9jV5nl9moKpOaf+0NPXrt2mGEewIDGZWpWVqA0bWO1oDfDnJx2guDx8CrG/tDE1Ikt+huSL1eJofLu8H/rzXUAOV5HHtt6uSnAqcTPNZfyJwdm/BBszzy2wlOYUNLPNRVQfPMM7cSfJamjXVvkFzLn9lVf2o31STY5nasIWF877Rfi34WA9Z5tEJwEnA71fVd/oOM3AHjXz/PZpxOwBXAVvMPs5c8PwyW3/fd4A59w1g33XN0h4Cx0xJkuZKki1oxuus7jvLPEmyLc3tvV9eyKmqz/eXaHK8MjWGJCuAv+TWH4I9ews1B5LsB7yCm9/30KwHc/c+cw1Zkp2B5wM7ccvPurdApsTzy2wlOYjmKtXmwM5J7kMze9LP+BQleTXNoPOLgZva3QUMokx5ZWoM7WyblwAXAr9Y2F9V3+wt1BxoZ5a9CDiHm//xsTDrSZOX5CvAO7j1Z/2M3kINnOeX2UpyDrA/8Lmq2qvdd2FV3avfZMPWfs73rKrrN3rwEuSVqfFcVVUr+w4xh37s+kYz97Oq+se+Q8wZzy+z9fOq+vHC8ydbXlWYvkuBzRh5hM+QWKbGc0yStwOn47OcZumzSV4HfJhbvu+uij49b0xyDPBJfM9nxfPLbF2U5CnAsiS7Ai9gQA/cXcR+CpyfZO3P+SAWS7VMjeeZNOtibMbNl+GL5oe8puf+7Z8rRvYVzSV6Tce9gKfSvMejn3Xf8+nx/DJbz6cZo3Y98D6a50++stdE82Fl+zVIjpkaQ5LVrkaseZBkDbCHz+ObHc8v/UmyDNiyqq7tO8s8aJ/7uVu7ubqqft5nnknapO8AS8SXkuzRd4h5k+ROSd6QZFX79fokd+o718B9Fbhz3yHmjOeXGUry3iRbJ9mSZtD/xUle0neuoUvyMODrwHHAm4H/SvKQXkNNkFemxpDkEmAX4DKaS8MLU/SdujxFST5E88N94fEPTwXuXVWPW///Sl0k+RywJ82q56PjGpw2PiWeX2YryflVdZ8khwN7A0cB5/h+T1c7i/IpC2t7JdkNeF9V3bffZJPhmKnxHNB3gDm1S1U9fmT7r5Oc31ua+XBM3wHmkOeX2dosyWbAHwL/VFU/T+JVhenbbHSR1Kr6r/b/D4Pgbb4xtOu9bA/s337/U3zvZuF/kzxoYaNdxPN/e8wzeO16UpfTnPjOoLlC5Uy+KfL8MnNvpfmMbwl8PsmOgGOmpm9VkrcneVj79TZgVd+hJsXbfGNop4qvAHavqt2S/Bbwgarar+dog9auTPwuYGGc1DXAM6rqK/2lGrYkzwGOAO5SVbu0U8ffUlWP6DnaYHl+6V+STavqxr5zDFmS2wHPAxZ+Qf4C8OahLOJpmRpDe2tpL+DckRVzL/Ae+2wk2RrAGTfT137W9wG+7OrQs+H5ZfaSHAj8LnD7hX1VdWx/iYavHfD/s6q6qd1eBtyuqn7ab7LJ8FLyeG6opnUW/PJDoSlL8ndJ7lxV11bVtUm2SfI3fecauOtHl0VIsimuDj1tnl9mKMlbgCfTrDcV4Ik0z0XUdJ0ObDGyvQXw6Z6yTJxlajwnJ3krcOf2Nsingbf1nGkePLqqfrSwUVXXAH/QY555cEaSlwNbJHkU8AHglJ4zDZ3nl9l6YFU9Dbimqv4a2Jeb1z7S9Ny+qv5nYaP9/g495pkoZ/ONoar+vv3Bci2wO3B0VX2q51jzYFmS2y3cU0+yBXC7njMN3VHAs2nW3/kT4FTg7b0mGjjPLzO3MInlp+34tKuB3+wxz7z4SZK9Fx5NleS+DGhCkWOmJiDJmVW1b985hibJS4GDgBPbXc8EVlbVa/
tLNd+SfGit5So0ZZ5fJivJ/wPeRPOIpOPa3W+vqv/XX6rhS3I/4CTgOzS3V38DeHJVndNrsAmxTE1AkvMWBo5qspIcADyy3fxUVZ3WZ55552d99nzPJ6u9wv1nwINpxql9AfjnqvpZr8HmQLuu1MKjk27xOJkkj1rKV2QtUxOQ5Nyq2rvvHPPG39hnz8/67PmeT1aSk4HrgPe0u54C3KmqntRfKi31z7ljprSU3X7jh0jSLdyzqkafhfjZJBf3lkYL0neALpzNNxlL+kOwhHlZdfb8rM+e7/lknZvkAQsbSe7PgFbiXsKW9PncK1OT8dS+A0gz8tK+A8whzy8TkORCmh/YmwFfSvKtdntH4Gt9ZtPSZ5kaQ5LHAa8B7kbzW+LCU90XVub+ao/x5pm/sU9Y+/zDV9D8gNmUmz/rd6f55pP9pRsmzy8z85i+A2iDLu87QBcOQB9DkjXAQVV1Sd9ZdLMk9/QHzWQl+RrwIuAc4KaF/VV1dW+hBs7zi4as/WVhvarqw7PKMk1emRrP9zzRzU6S69jA/XN/Y5+qH1fVJ/oOMWc8v2jIDmr/vBvwQOAz7fbDgS8Blqk5sirJ+4GPAr98wvVQGvViU1VbASR5JfBd4N00tz4Ox5WKp+2zSV5Hc4Ib/ayf21+kwfP8osGqqmcCJPkksEdVfbfd/k3gnT1Gmyhv840hyYnr2F1V9ayZh5kjSb5SVffe2D5NTpLPrmN3VdX+Mw8zJzy/aB4kuaSqfmdkexPgotF9S5lXpsaw0Kw1cz9JcjjNIwgKOAz4Sb+Rhq2qHt53hnnj+UVz4vQkpwHva7efTPNQ70FwnakxJNktyelJvtpu75nkr/rONQeeAjwJ+F779cR2n6Ykya8neUeST7TbeyR5dt+5hszzi+ZBVR0JvAW4d/t1fFU9v99Uk+NtvjEkOQN4CfDWhWdkJflqVd2z32TSZLUl6kTgL6vq3kk2Bc6rqnv1HG2wPL9oXiTZEdi1qj6d5A7Asqq6ru9ck+CVqfHcoarOWmvfjb0kmSP+xt6Lu1bVycAvAKrqRkaWSNBUeH7R4CV5DvBB4K3trm1pJl0MgmVqPD9IsgvtdP0kT6CZZabpehvwMuDnAFV1AXBor4mG7ydJfo2bP+sPAH7cb/GmaTIAABbySURBVKTB8/yiefA8YD/gWoCq+jrNcgmD4AD08TwPOB64R5Irgctopulruu5QVWclt1jo3N/Yp+vFwEpglyRfBJYDT+g30uB5ftE8uL6qblg4n7dDCAYzzsgyNZ5tquqRSbYENqmq65I8Bvhm38EGzt/YZ+8a4KHA7jRre60G7tNrouHz/KJ5cEaSlwNbJHkU8FzglJ4zTYwD0MeQ5FzgaQsrbic5FHhRVd2/32TDluTuNL+xP5Dmh/xlwOFV5Q+ZKUlyDnBwVV3Zbj8EOM4B6NPj+UXzoF1X6tnA79H8onYa8PYaSAmxTI2h/aH+QZpp+Q8GngY8pqocSzIDo7+x951l6JLcD3gzzSMg9gZeRfNZ/3avwQbM84vmRZLNgXvQ3G1YXVU39BxpYixTY0qyG83Mg28Bj62q/+050uC1A6GPAR5E84/vP4BjfejudCXZl2bGzc+AA6vqqp4jDZ7nFw1dkgNp1pn6Bs2VqZ2BPxnKs0AtUxuQ5EJuOUDubjQzm64HqKo9+8g1L5J8Cvg88J521+HAw6rqkf2lGqYkp3DLz/oeNOPTrgGoqoP7yDVknl80T5J8jeaK65p2exfg41V1j36TTYZlagPaBcbWy7E707WuhQuTXOj4nclL8tANvV5VZ8wqy7zw/KJ5kuTsqrrfyHaAs0b3LWXO5tuA0ZNZknvTjGcA+EJVfaWfVHPlk+1g3JPb7SfQDFrUhI2WpSS/Diyc4M6qqu/3k2rYPL9oHiR5XPvtqiSn0pzPi+bxYGf3FmzCvDI1hiQvBJ4DfLjd9Via5wq9qb9Uw5XkOpp/bAG2pF2Nm2aR2f+pqq37yjZ0SZ4EvA74HM
37/2DgJVX1wT5zDZnnFw1ZkhM39PpQHvRtmRpDkguAfavqJ+32lsCZjmnQ0CT5CvCohatRSZYDn66qe/ebbLg8v0hLn7f5xhNu+Xyym9p9mrIkewI7MfJZraoPr/d/oK42Weu23tX42Klp8/yiwUuyM/B8bn0+H8TkFsvUeE4EvpzkI+32HwIn9JhnLiQ5AdgTuIibb/UVN98O0eT9e5LTgPe1208GBjF1eRHz/KJ58FHgHTSrnv9iI8cuOd7mG1OSvWnWO4JmgOh5feaZB0kurqo9+s4xb9oBo6Of9Y9s6Hh15/lFQ5fky0Ne1d8yNYYk766qp25snyYryTuA11fVxX1nmRdJXlNVL93YPk2O5xfNgyRPAXYFPkm7lhpAVZ3bW6gJ8jbfeH53dCPJMuC+PWWZJ/8CnJnkv2n+8QUoB+ZO1aOAtYvTo9exT5Pj+UXz4F7AU4H9ueWwjf17SzRBlqkNSPIyYOEp19cu7AZuoHkAr6brHTT/+C5kgPfYF5Mkf0bzFPe7t7PLFmwFfLGfVMPm+UVz5onA3Yf0PL5R3uYbQ5JXVdXLNvD671bVRbPMNA+SnFlV+/adYx4kuROwDc2DjY8aeem6qvrhyHHbVNU1s843ZJ5fNA+SfBQ4YqiLAFumJiDJuVW1d985hibJm4E708z+GL3H7my+nvhZnz3fcw1Bks/RzM4+m1uez10aQb/kmjDTsQXNP7rfG9nn0gj98rM+e77nGoJj+g4wTZapyfDy3hQM5TEDA+NnffZ8z7XkVdUZ7cO9d62qTye5A7Cs71yT4srGWrSS7Jbk9CRfbbf3TPJXfeeSJP1qkjwH+CDw1nbXtjQLeQ6CZWoj0th+I4cNcnbCIvA24GXAzwGq6gLg0F4TyVtOs+f5RUPwPGA/4FqAqvo6cLdeE02Qt/k2oqoqyak0a2Ss75gHzDDSPLlDVZ2V3OLn9419hRm6dn2ji6rqHhs47BGzyjN07arn67WwmKHnFw3E9VV1w8L5PMmmDOgWtmVqPOcmuV9Vnd13kDnzgyS70P6DS/IE4Lv9RhquqropyeokO1TVt9ZzzA/XtV+3yevbP28PrAC+QnPlb09gFeCyIBqSM5IsrKv2KJp17U7pOdPEuDTCGJJ8Dfht4JvAT3Al7plIcneaxQsfCFwDXAYcXlXf7DXYgCX5PLAXcBbNZx0YzvTlxSjJh4FjqurCdvuewCuq6gn9JpMmJ8kmwLNpZmcHOA14ew2khFimxtDOQLgVf6jPRpItgU2q6rq19j+9qt7VU6xBSvLQde2vqjNmnWVeJLmoqtZ+pMyt9klDluRDVfX4vnPcVpapX0GSu9FckgdgfbdCNBsuZqghSPI+mquA72l3HQ7csaoO6y+VNFtJzquqvfrOcVs5m28MSQ5O8nWa20xnAJcDn+g1lMCZZROX5AFJzk7yP0luSHLTyHPjNB3PBC4CXth+Xdzuk+bJkr6y4wD08bwSeADw6araK8nDgT/qOZOW+D++ReqfaJaf+ADNoOinAbv1mmjgqupnSd4CnFpVq/vOI+lX55Wp8fy8qq4GNkmySVV9luYHjfrllakpqKo1wLKquqmqTgQO6DvTkCU5GDgf+Pd2+z5JVvabSpq5JX0+98rUeH6U5I7A54F/TfJ9RmY6qTdf7DvAAP00yebA+UleS7MUhb90TdcxwD7A5wCq6vwkO/eaSJqgdg27f6mqwzdw2EtnlWcaHIA+hnY22c9omvPhwJ2Af22vVmlKktwOeDywEyPFv6qO7SvT0LUzV78HbA68iOaz/ub2apWmIMl/VtUDRgfgJrnApVc0JEn+A9i/qga5or9XpsZQVaNXoZyKPzsfA34MnANc33OWuTCy3MfPgL/uM8scuSjJU4BlSXYFXgB8qedM0qRdCnyxvYU9uobdG/qLNDmWqTEkeRzwGprnCIWbF+3cutdgw7ddVTleZ4aS7Ae8AtiRW14NvHtfmebA84G/pPmF4X00ixm+stdE0uR9o/
3aBNiq5ywT522+MSRZAxxUVZf0nWWeJDkeeNPCytCavna1/xfRXA28aWG/t7Qlaf28MjWe71mkevEg4BlJLqP5rd3H+Ezfj6vKNdRmKMluwP/l1mMD9+8rkzRpSZYDfwH8Lrdc/HoQn3OvTG1Ae3sP4KHAbwAfZWTsTlV9uI9c88LH+MxOkoWV5J8ELAM+zC0/6+f2kWseJPkK8BZufTXwnN5CSROW5JPA+2l+cfhT4OnAVVW1pGfxLbBMbUCSEzfwclXVs2YWZk4leRCwa1Wd2P5mc8equqzvXEOT5LMbeLmG8tvjYpTknKq6b985pGla+JyPzlRNcnZV3a/vbJPgbb4NqCof6dCjJMfQLI66O3AisBnN88v26zPXEFXVw/vOMMdOSfJc4CPc8mrgD/uLJE3cz9s/v5vkQOA7wF16zDNRLsY3hiTvSnLnke1tkpzQZ6Y58VjgYNpptFX1HQY4C2QxSfJ36/is/02fmebA04GX0CyHcE77tarXRNLk/U2SOwF/TnOr7+00k10Gwdt8Y1jX06yX+hOul4IkZ1XVPknOraq928VTz3QA+vSs57N+blXtvb7/jSTNO2/zjWeTJNtU1TUASe6C790snJzkrcCdkzwHeBbNbzOanmVJbldV1wMk2QK4Xc+ZBinJ/lX1mZGJLrfgBBcNSTvm9TncetbqIMYeWwjG83rgzCQfaLefCPxtj3nmxeuBRwLX0oybOprm+Yiann8FTh+ZfPFMXPV/Wh4KfAY4aB2vFc2MSmkoPgZ8Afg0I7NWh8LbfGNKsgewMKPpM1V18chrv7xqpclJcsLoby3tw6Y/VlWP6DHW4CU5gKbEAnyqqk7rM4+kpS/J+VV1n75zTItlagIcUzIdSV4J/FpVPTfJNsDHgbdV1YaWrNAUJTmzqvbtO8fQtLOb1l7M0Ad6azDaiSxfqqpT+84yDZapCXAw+vQkeS2wNXBf4NVV9aGeI801P+uTl+QtwB2Ah9OMCXwCcFZVPbvXYNIEJLmO5rZ1gC1plv/4OQN7xq1lagK8MjVZaw3IDfD/gLOAfwcH5vbJz/rkLSxiOPLnHYFPVNWD+84maTwOQNditPaA3PNoFuw8CAfmanh+1v750yS/BVwN/GaPeaSJS/JYmvHGP2637ww8rKo+2m+yybBMTUb6DjAkrjy/qPlZn7xT2h8srwPOpfmF4W39RpIm7piq+sjCRlX9qH3KxSDKlCugjyHJLklu137/sCQvGF0lGnB22RQk2S7JR5J8v/36UJLt+s41557ad4AhSbIJcHpV/agdD7gjcI+qOrrnaNKkratvDOaCjmVqPB8Cbkry28DxwPbAexde9BlaU3MisBL4rfbrlHafJizJdUmuXd/XwnFV9dU+cw5NVf0COG5k+/qF2yDSwKxK8ob24sQuSd5A8+ikQbBMjecXVXUjzbPi3lRVL8ExDbOwvKpOrKob2693Asv7DjVEVbVVO6vmjcBRwLbAdsBLgf+vz2xz4PQkj0/iLVQN2fOBG4D3AyfRjBV8Xq+JJsjZfGNI8mWaHyh/CRxUVZcl+WpV3bPnaIOW5HSaK1Hva3cdBjzTRTunJ8lXqureG9unyWmnjm8J3EjzA2ZQU8alcSR5U1U9v+8ct5VXpsbzTGBf4G/bIrUz8O6eM82DZwFPAv4b+C7N+jvP6DPQHPhJksOTLEuySZLDgZ/0HWrI2quCm1TV5lW19chVQmme7Nd3gC68MjWmJJsDu7Wbq6vq533mmQdJ9quqL25snyYnyU40t/r2o5lV9kXg/1TV5f2lGrYkp699tXVd+6QhW+pr2A1mJP00JXkYzcNeL6e5BL99kqdXlQ/dna43AWv/41rXPk1IW5oO6TvHPEhye5qVz+/aPi5pYczU1jRj1iQtEZap8bwe+L2qWg2QZDeacTz37TXVQCXZF3ggsDzJi0de2hpY1k+qYUvyF1X12iRvorkidQtV9YIeYg3dnwD/h2am6rkj+68F/qmXRFJ/lv
QEDMvUeDZbKFIAVfVfSTbrM9DAbQ7ckebzudXI/mtpxk1p8i5p/1zVa4o5UlVvBN6Y5PlV9aa+80g9e2PfAbpwzNQYkpwA/AJ4T7vrcGBZVT2rv1TDl2THqvrmBl5f0rM/Fpsky4DXVNX/7TvLPEmyJfAiYIeqOiLJrsDuVfVvPUeTJibJCpoZ8TvS/KK8MGt1z16DTYhlagzt6ufPAx7U7voC8Oaqur6/VFrqAxYXoyRnVtW+feeYJ0neT7N44dOq6p5J7gB8qaru03M0aWKSrAZeAlxIc3ECgA39wryUeJtvDFV1fZJ/Ak6n+RCsrqobeo4lTcP5SVYCH2BkSYSq8uHS07NLVT05yWEAVfVTF/DUAF1VVSv7DjEtlqkxJDkQeAvwDZpLkzsn+ZOq+kS/yaSJuz1wNbD/yL4CLFPTc0OSLWgH/ifZBfCqt4bmmCRvp7ko8cvP91B+UbNMjef1wMOrag388mT3ccAy1S9/e5+wqnpm3xnm0DHAv9MsufKvNGt8PaPXRNLkPRO4B7AZN9/mG8wvapap8Vy3UKRalwLX9RVGv7SkZ38sRknuTvO+PoDmRHcmzaKdl/UabKCSbAJsAzyO5j0P8MKq+kGvwaTJu19V7d53iGlxAPoYkvwzzQyEk2l+wDwR+BbwaRjOZcrFZuizPxajJP8JHMfNz0M8FHh+Vd2/v1TDlmRVVa3oO4c0TUlOBF5XVRf3nWUaLFNjaD8E61MukTAdQ5/9sRgluWDtsuqDjqcryauBHwDv55aD/n/YWyhpwpJcAuwCXEYzZmpQvxxbpiYgycuq6lV95xiaJP9RVQ/a+JHqKsld2m9fClwDnERzFfbJwDZV9bK+sg1dkstY96rzd+8hjjQVSXZc1/6h/HJsmZoA1zuajiSPAA5joLM/FpORH+jrGtRf/mCfnnYm33Np1rErmnXs3lJV/9trMGnCktwbeHC7+YWq+kqfeSbJAeiT4ayy6Rj07I/FpKp27jvDHHsXzaOS/rHdfkq770m9JZImLMkLgedw8/n7PUmOH8qjlLwyNQFemZqOJKuHPPtjMUpyDvAO4L1V9aO+88yDJBdX1R4b2yctZUkuAPatqp+021sCZw5lzNQmfQcYCK9MTceXkvgDZbaeDGwLrEpyUpLfdzXuqTs3yQMWNpLcHx84reEJcNPI9k0M6GenV6YmIMnLq+rv+s4xNEOf/bGYtesfPQb4Z5qT3onAG51hNnnt53x3muVWAHYAVgM34uddA5HkxcDTgY+0u/4QeFdV/UN/qSbHMrUBSd7EOmbZLKiqF8wwztwZ+uyPxSrJnsCzgEcDpwH/SjM4+qk+fHfy1vc5X+DnXUORZG+acwk0A9DP6zPPJDkAfcMWLrXvB+xBsw4MNIt2DnLhscWkqr455Nkfi1E7ZupHwNuBl1bVwizKLyfZr79kw2VZ0jxI8u6qeipw7jr2LXlemRpDuyr0g6rqxnZ7M5of7A/Y8P9SXaxj9sdjgcHM/liM2jFqe3HzqvMAVNWxvYWStOStPVEryTLgwqFMtPDK1Hi2AbYGFsaL3LHdp+l6NnD/kdkfr6F5VpxlanreQHNl6lxG1vaSpNsiycuAlwNbJLl2YTdwA3B8b8EmzDI1nlcD5yX5LM2H4CHAK3pNNB8GPftjkdquqg7oO4SkYWifDvKqJK8a8pMULFNjqKoTk3wCWHjY60ur6r/7zDQnTqQZqzM6++OEHvPMgy8luVdVXdh3EEmD8m9JtqyqnyT5I2BvmhnCgxgz6JipDUhyj6r6WjsD4Vaq6tx17dfkDHn2x2KS5EKamaubArsCl+JyFJImpF20897AnsA7aSa5PKmqHtpnrkmxTG1Au9T9Ee3tvbVVVe0/81BzZF0zPYY0+2MxcXq+pGlaGICe5Gjgyqp6x5CeHuJtvg1oi9QmwF9V1Rf7zjOHfnd0o539cd+esgyaZUnSlF3XDkb/I+Ah7c/WzXrONDE+TmYjquoXwD/1nWOeJHlZku
uAPZNc235dB3wf+FjP8SRJv7on0wwdeHY75ng74HX9Rpocb/ONIcnf00zJ/3D5hs3M0Gd/SJKGwTI1hvaqyJY0U/P/l5sH5W7da7CBa1fcPn+osz8kaV60P0cXCsfmNLf4/qeq7tRfqsnxNt8Yqmqrqtqkqjarqq3bbYvU9P0z8NP2kTJ/DnwD+Jd+I0mSflULPzfbn51bAI+nOccPglemxpTkcTRT9Itmiv5He440eEOf/SFJ8yzJeVW1V985JsHZfGNI8mbgt4H3tbv+NMmjqup5PcaaB4Oe/SFJ86K9ILFgE2AF8LOe4kycV6bGkORrwO8sDD5vf6hfVFW/02+yYUvyG8BTgLOr6gtJdgAeVlXe6pOkJSTJiSObNwKX0zy4/qp+Ek2WV6bGswbYAVgY+Lx9u09T1E6ffcPI9rdwzJQkLUWbAC+sqh8BJNkGeD3wrF5TTYhlagOSnEIzRmor4JIkZ7Xb9wfO6jPbPBj67A9JmiN7LhQpgKq6JskgxkuBZWpj/r7vAPOsqrZa+D5JgEOAB/SXSJJ0G22SZJuqugYgyV0YUAdxzJSWlCHN/pCkeZHkacDLgQ+0u54I/G1Vvbu/VJNjmdqAJP9RVQ9a63YTuGjnTKxn9sdDq2rfniJJkm6jJHsA+7ebn6mqi/vMM0mWKS1aQ5/9IUkahsHcr5yWJMtolkG4R99Z5tCgZ39IkobBx8lsRFXdBKxu1zjSbN1q9gfgeClJ0qLilanxbANc1C6N8JOFnVV1cH+R5sKgZ39IkobBH0zjuT3wmJHtAK/pKcs8eT1wZpJbzP7oMY8kSbdimRrPplV1xuiOJFv0FWZeVNW/JFnFzbM/Hjek2R+SpGFwNt8GJPkz4LnA3YFvjLy0FfDFqvqjXoJJkqRFwzK1AUnuRDNe6lXAUSMvXVdVP+wnlSRJWkwsU5IkSR24NIIkSVIHlilJkqQOLFOSJEkdWKYkSZI6sExJkiR18P8DBFALnoL5PTkAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 720x504 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": [],
            "needs_background": "light"
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iEoLsCNbwNRA"
      },
      "source": [
        "Nice! Based on F1-scores, it looks like our tribrid embedding model performs the best by a fair margin.\n",
        "\n",
        "Though, in comparison to the results reported in Table 3 of the [*PubMed 200k RCT:\n",
        "a Dataset for Sequential Sentence Classification in Medical Abstracts*](https://arxiv.org/pdf/1710.06071.pdf) paper, our model's F1-score is still underperforming (the authors model achieves an F1-score of 90.0 on the 20k RCT dataset versus our F1-score of ~82.6).\n",
        "\n",
        "There are some things to note about this difference:\n",
        "* Our models (with an exception for the baseline) have been trained on ~18,000 (10% of batches) samples of sequences and labels rather than the full ~180,000 in the 20k RCT dataset.\n",
        "  * This is often the case in machine learning experiments though, make sure training works on a smaller number of samples, then upscale when needed (an extension to this project will be training a model on the full dataset).\n",
        "* Our model's prediction performance levels have been evaluated on the validation dataset not the test dataset (we'll evaluate our best model on the test dataset shortly)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "pk5rMP0rarWG"
      },
      "source": [
        "## Save and load best performing model\n",
        "\n",
        "Since we've been through a fair few experiments, it's a good idea to save our best performing model so we can reuse it without having to retrain it.\n",
        "\n",
        "We can save our best performing model by calling the [`save()`](https://www.tensorflow.org/guide/keras/save_and_serialize#the_short_answer_to_saving_loading) method on it."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HRalPoXEi0Es",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "1ed265b6-964e-49b5-e461-53a007579a6d"
      },
      "source": [
        "# Save best performing model to SavedModel format (default)\n",
        "model_5.save(\"skimlit_tribrid_model\") # model will be saved to path specified by string"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe72bef43b0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe72bef43b0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n",
            "WARNING:absl:Found untraced functions such as lstm_cell_10_layer_call_fn, lstm_cell_10_layer_call_and_return_conditional_losses, lstm_cell_11_layer_call_fn, lstm_cell_11_layer_call_and_return_conditional_losses, lstm_cell_10_layer_call_fn while saving (showing 5 of 10). These functions will not be directly callable after loading.\n",
            "WARNING:absl:Found untraced functions such as lstm_cell_10_layer_call_fn, lstm_cell_10_layer_call_and_return_conditional_losses, lstm_cell_11_layer_call_fn, lstm_cell_11_layer_call_and_return_conditional_losses, lstm_cell_10_layer_call_fn while saving (showing 5 of 10). These functions will not be directly callable after loading.\n"
          ],
          "name": "stderr"
        },
        {
          "output_type": "stream",
          "text": [
            "INFO:tensorflow:Assets written to: skimlit_tribrid_model/assets\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "INFO:tensorflow:Assets written to: skimlit_tribrid_model/assets\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "RMzS3dWPx9xu"
      },
      "source": [
        "Optional: If you're using Google Colab, you might want to copy your saved model to Google Drive (or [download it](https://colab.research.google.com/notebooks/io.ipynb#scrollTo=hauvGV4hV-Mh)) for more permanent storage (Google Colab files disappear after you disconnect)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Mgsma17oUtAE"
      },
      "source": [
        "# Example of copying saved model from Google Colab to Drive (requires Google Drive to be mounted)\n",
        "# !cp skim_lit_best_model -r /content/drive/MyDrive/tensorflow_course/skim_lit"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5Go3DCssvA1o"
      },
      "source": [
        "\n",
        "Like all good cooking shows, we've got a pretrained model (exactly the same kind of model we built for `model_5` [saved and stored on Google Storage](https://storage.googleapis.com/ztm_tf_course/skimlit/skimlit_best_model.zip)). \n",
        "\n",
        "So to make sure we're all using the same model for evaluation, we'll download it and load it in. \n",
        "\n",
        "And when loading in our model, since it uses a couple of [custom objects](https://www.tensorflow.org/guide/keras/save_and_serialize#custom_objects) (our TensorFlow Hub layer and `TextVectorization` layer), we'll have to load it in by specifying them in the `custom_objects` parameter of [`tf.keras.models.load_model()`](https://www.tensorflow.org/api_docs/python/tf/keras/models/load_model). \n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "7dQESoCSuUnK",
        "outputId": "1650ba2b-b90d-45b3-f135-88d1911e41ee"
      },
      "source": [
        "# Download pretrained model from Google Storage\n",
        "!wget https://storage.googleapis.com/ztm_tf_course/skimlit/skimlit_tribrid_model.zip\n",
        "!mkdir skimlit_gs_model\n",
        "!unzip skimlit_tribrid_model.zip -d skimlit_gs_model"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "--2021-04-02 04:35:03--  https://storage.googleapis.com/ztm_tf_course/skimlit/skimlit_tribrid_model.zip\n",
            "Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.20.128, 74.125.142.128, 74.125.195.128, ...\n",
            "Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.20.128|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 962957902 (918M) [application/zip]\n",
            "Saving to: ‘skimlit_tribrid_model.zip.1’\n",
            "\n",
            "skimlit_tribrid_mod 100%[===================>] 918.35M   234MB/s    in 4.0s    \n",
            "\n",
            "2021-04-02 04:35:07 (231 MB/s) - ‘skimlit_tribrid_model.zip.1’ saved [962957902/962957902]\n",
            "\n",
            "Archive:  skimlit_tribrid_model.zip\n",
            "   creating: skimlit_gs_model/skimlit_tribrid_model/\n",
            "   creating: skimlit_gs_model/skimlit_tribrid_model/assets/\n",
            "   creating: skimlit_gs_model/skimlit_tribrid_model/variables/\n",
            "  inflating: skimlit_gs_model/skimlit_tribrid_model/variables/variables.index  \n",
            "  inflating: skimlit_gs_model/skimlit_tribrid_model/variables/variables.data-00000-of-00001  \n",
            "  inflating: skimlit_gs_model/skimlit_tribrid_model/saved_model.pb  \n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "mDRneseeZSRY",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "613e0a56-c274-447d-b292-2cf0c4f2bd39"
      },
      "source": [
        "# Import TensorFlow model dependencies (if needed) - https://github.com/tensorflow/tensorflow/issues/38250 \n",
        "import tensorflow_hub as hub\n",
        "import tensorflow as tf\n",
        "from tensorflow.keras.layers.experimental.preprocessing import TextVectorization\n",
        "\n",
        "model_path = \"skimlit_gs_model/skimlit_tribrid_model\"\n",
        "\n",
        "# Load downloaded model from Google Storage\n",
        "loaded_model = tf.keras.models.load_model(model_path,\n",
        "                                          custom_objects={\"TextVectorization\": TextVectorization, # required for char vectorization\n",
        "                                                          \"KerasLayer\": hub.KerasLayer}) # required for token embedding"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe69c6e4c20> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe69c6e4c20> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stderr"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe69c61def0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe69c61def0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stderr"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe6a58217a0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe6a58217a0> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stderr"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe72a2be050> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:11 out of the last 11 calls to <function recreate_function.<locals>.restored_function_body at 0x7fe72a2be050> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for  more details.\n"
          ],
          "name": "stderr"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GOY7As4Dxdn_"
      },
      "source": [
        "### Make predictions and evalaute them against the truth labels\n",
        "\n",
        "To make sure our model saved and loaded correctly, let's make predictions with it, evaluate them and then compare them to the prediction results we calculated earlier."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "WmLdyobSv95I",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "b52b78e1-ade6-4402-df89-2f35085b014a"
      },
      "source": [
        "# Make predictions with the loaded model on the validation set\n",
        "loaded_pred_probs = loaded_model.predict(val_pos_char_token_dataset, verbose=1)\n",
        "loaded_preds = tf.argmax(loaded_pred_probs, axis=1)\n",
        "loaded_preds[:10]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "945/945 [==============================] - 23s 21ms/step\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(10,), dtype=int64, numpy=array([0, 0, 3, 2, 2, 4, 4, 4, 4, 1])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 207
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jS4XMK6yxEn0",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0071b943-9d64-4d5c-c3e5-816034b89753"
      },
      "source": [
        "# Evaluate loaded model's predictions\n",
        "loaded_model_results = calculate_results(val_labels_encoded,\n",
        "                                         loaded_preds)\n",
        "loaded_model_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 82.83132530120481,\n",
              " 'f1': 0.8272937671199255,\n",
              " 'precision': 0.8268115620164092,\n",
              " 'recall': 0.8283132530120482}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 208
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "a_zJXe1v1Evs"
      },
      "source": [
        "Now let's compare our loaded model's predictions with the prediction results we obtained before saving our model."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "TNDhkLYzznXn"
      },
      "source": [
        "# Compare loaded model results with original trained model results (should return no errors)\n",
        "assert model_5_results == loaded_model_results"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "C5EMfCId1WKr"
      },
      "source": [
        "It's worth noting that loading in a SavedModel unfreezes all layers (makes them all trainable). So if you want to freeze any layers, you'll have to set their trainable attribute to `False`."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "FZEk80xiqNLT",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0b5ea07c-4d5b-4969-cb7a-d262f8d1bbb7"
      },
      "source": [
        "# Check loaded model summary (note the number of trainable parameters)\n",
        "loaded_model.summary()"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Model: \"model_18\"\n",
            "__________________________________________________________________________________________________\n",
            "Layer (type)                    Output Shape         Param #     Connected to                     \n",
            "==================================================================================================\n",
            "char_inputs (InputLayer)        [(None, 1)]          0                                            \n",
            "__________________________________________________________________________________________________\n",
            "char_vectorizer (TextVectorizat (None, 290)          0           char_inputs[0][0]                \n",
            "__________________________________________________________________________________________________\n",
            "token_inputs (InputLayer)       [(None,)]            0                                            \n",
            "__________________________________________________________________________________________________\n",
            "char_embed (Embedding)          (None, 290, 25)      1750        char_vectorizer[0][0]            \n",
            "__________________________________________________________________________________________________\n",
            "universal_sentence_encoder (Ker (None, 512)          256797824   token_inputs[0][0]               \n",
            "__________________________________________________________________________________________________\n",
            "bidirectional_3 (Bidirectional) (None, 64)           14848       char_embed[0][0]                 \n",
            "__________________________________________________________________________________________________\n",
            "token_char_hybrid_embedding (Co (None, 576)          0           universal_sentence_encoder[0][0] \n",
            "                                                                 bidirectional_3[0][0]            \n",
            "__________________________________________________________________________________________________\n",
            "line_number_input (InputLayer)  [(None, 15)]         0                                            \n",
            "__________________________________________________________________________________________________\n",
            "total_lines_input (InputLayer)  [(None, 20)]         0                                            \n",
            "__________________________________________________________________________________________________\n",
            "dense_18 (Dense)                (None, 256)          147712      token_char_hybrid_embedding[0][0]\n",
            "__________________________________________________________________________________________________\n",
            "dense_16 (Dense)                (None, 32)           512         line_number_input[0][0]          \n",
            "__________________________________________________________________________________________________\n",
            "dense_17 (Dense)                (None, 32)           672         total_lines_input[0][0]          \n",
            "__________________________________________________________________________________________________\n",
            "dropout_4 (Dropout)             (None, 256)          0           dense_18[0][0]                   \n",
            "__________________________________________________________________________________________________\n",
            "token_char_positional_embedding (None, 320)          0           dense_16[0][0]                   \n",
            "                                                                 dense_17[0][0]                   \n",
            "                                                                 dropout_4[0][0]                  \n",
            "__________________________________________________________________________________________________\n",
            "output_layer (Dense)            (None, 5)            1605        token_char_positional_embedding[0\n",
            "==================================================================================================\n",
            "Total params: 256,964,923\n",
            "Trainable params: 256,964,923\n",
            "Non-trainable params: 0\n",
            "__________________________________________________________________________________________________\n"
          ],
          "name": "stdout"
        }
      ]
    },
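    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If you did want to freeze part of the loaded model, a minimal sketch might look like the following. It assumes the layer you'd want to freeze is the pretrained token embedding, which appears as `universal_sentence_encoder` in the summary above. This step is optional and isn't needed for making predictions."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Freeze the pretrained token embedding layer (layer name taken from the model summary above)\n",
        "use_layer = loaded_model.get_layer(\"universal_sentence_encoder\")\n",
        "use_layer.trainable = False\n",
        "\n",
        "# Check which layers are now trainable\n",
        "for layer in loaded_model.layers:\n",
        "  print(layer.name, layer.trainable)\n",
        "\n",
        "# Note: recompile the model before any further training so the change takes effect\n",
        "# (the loss, optimizer and metrics below are assumptions, swap in your own)\n",
        "# loaded_model.compile(loss=\"categorical_crossentropy\", optimizer=tf.keras.optimizers.Adam(), metrics=[\"accuracy\"])"
      ],
      "execution_count": null,
      "outputs": []
    },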
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5uq0MFPiaoUb"
      },
      "source": [
        "## Evaluate model on test dataset\n",
        "\n",
        "To make our model's performance more comparable with the results reported in Table 3 of the [*PubMed 200k RCT:\n",
        "a Dataset for Sequential Sentence Classification in Medical Abstracts*](https://arxiv.org/pdf/1710.06071.pdf) paper, let's make predictions on the test dataset and evaluate them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "mkFb3giT2FYW",
        "outputId": "4b53bb9b-29aa-4359-c28c-62f05c621bd6"
      },
      "source": [
        "# Create test dataset batch and prefetched\n",
        "test_pos_char_token_data = tf.data.Dataset.from_tensor_slices((test_line_numbers_one_hot,\n",
        "                                                               test_total_lines_one_hot,\n",
        "                                                               test_sentences,\n",
        "                                                               test_chars))\n",
        "test_pos_char_token_labels = tf.data.Dataset.from_tensor_slices(test_labels_one_hot)\n",
        "test_pos_char_token_dataset = tf.data.Dataset.zip((test_pos_char_token_data, test_pos_char_token_labels))\n",
        "test_pos_char_token_dataset = test_pos_char_token_dataset.batch(32).prefetch(tf.data.AUTOTUNE)\n",
        "\n",
        "# Check shapes\n",
        "test_pos_char_token_dataset"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<PrefetchDataset shapes: (((None, 15), (None, 20), (None,), (None,)), (None, 5)), types: ((tf.float32, tf.float32, tf.string, tf.string), tf.float64)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 211
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "BpoQj-PexFf9",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "56496e79-594f-4dd0-f9d9-29cfad135a06"
      },
      "source": [
        "# Make predictions on the test dataset\n",
        "test_pred_probs = loaded_model.predict(test_pos_char_token_dataset,\n",
        "                                       verbose=1)\n",
        "test_preds = tf.argmax(test_pred_probs, axis=1)\n",
        "test_preds[:10]"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "942/942 [==============================] - 20s 21ms/step\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(10,), dtype=int64, numpy=array([3, 3, 2, 2, 4, 4, 4, 1, 4, 0])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 212
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "avvjksEqxe_0",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "bfdcd965-7852-487c-f495-e38db27eee60"
      },
      "source": [
        "# Evaluate loaded model test predictions\n",
        "loaded_model_test_results = calculate_results(y_true=test_labels_encoded,\n",
        "                                              y_pred=test_preds)\n",
        "loaded_model_test_results"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "{'accuracy': 82.3859299817488,\n",
              " 'f1': 0.8228365555962726,\n",
              " 'precision': 0.8224125471470242,\n",
              " 'recall': 0.823859299817488}"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 213
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "eupgOniJ3rLr"
      },
      "source": [
        "It seems our best model (so far) still has some ways to go to match the performance of the results in the paper (their model gets 90.0 F1-score on the test dataset, where as ours gets ~82.1 F1-score).\n",
        "\n",
        "However, as we discussed before our model has only been trained on 20,000 out of the total ~180,000 sequences in the RCT 20k dataset. We also haven't fine-tuned our pretrained embeddings (the paper fine-tunes GloVe embeddings). So there's a couple of extensions we could try to improve our results."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "B8orhq8dPAuW"
      },
      "source": [
        "## Find most wrong\n",
        "\n",
        "One of the best ways to investigate where your model is going wrong (or potentially where your data is wrong) is to visualize the \"most wrong\" predictions.\n",
        "\n",
        "The most wrong predictions are samples where the model has made a prediction with a high probability but has gotten it wrong (the model's prediction disagreess with the ground truth label).\n",
        "\n",
        "Looking at the most wrong predictions can give us valuable information on how to improve further models or fix the labels in our data.\n",
        "\n",
        "Let's write some code to help us visualize the most wrong predictions from the test dataset.\n",
        "\n",
        "First we'll convert all of our integer-based test predictions into their string-based class names."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "7yI34yymyycT",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "a542897c-0964-4e5e-a6da-a3941e844af5"
      },
      "source": [
        "%%time\n",
        "# Get list of class names of test predictions\n",
        "test_pred_classes = [label_encoder.classes_[pred] for pred in test_preds]\n",
        "test_pred_classes"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "CPU times: user 9.86 s, sys: 874 ms, total: 10.7 s\n",
            "Wall time: 8.81 s\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0B41eg6O6DbQ"
      },
      "source": [
        "Now we'll enrich our test DataFame with a few values:\n",
        "* A `\"prediction\"` (string) column containing our model's prediction for a given sample.\n",
        "* A `\"pred_prob\"` (float) column containing the model's maximum prediction probabiliy for a given sample.\n",
        "* A `\"correct\"` (bool) column to indicate whether or not the model's prediction matches the sample's target label."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "au11pLUEPCaj",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 639
        },
        "outputId": "46d39031-2b15-4144-ad67-c63f225cc522"
      },
      "source": [
        "# Create prediction-enriched test dataframe\n",
        "test_df[\"prediction\"] = test_pred_classes # create column with test prediction class names\n",
        "test_df[\"pred_prob\"] = tf.reduce_max(test_pred_probs, axis=1).numpy() # get the maximum prediction probability\n",
        "test_df[\"correct\"] = test_df[\"prediction\"] == test_df[\"target\"] # create binary column for whether the prediction is right or not\n",
        "test_df.head(20)"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>target</th>\n",
              "      <th>text</th>\n",
              "      <th>line_number</th>\n",
              "      <th>total_lines</th>\n",
              "      <th>prediction</th>\n",
              "      <th>pred_prob</th>\n",
              "      <th>correct</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>this study analyzed liver function abnormaliti...</td>\n",
              "      <td>0</td>\n",
              "      <td>8</td>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>0.530844</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>a post hoc analysis was conducted with the use...</td>\n",
              "      <td>1</td>\n",
              "      <td>8</td>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>0.319480</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>liver function tests ( lfts ) were measured at...</td>\n",
              "      <td>2</td>\n",
              "      <td>8</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.739303</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>survival analyses were used to assess the asso...</td>\n",
              "      <td>3</td>\n",
              "      <td>8</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.606798</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the percentage of patients with abnormal lfts ...</td>\n",
              "      <td>4</td>\n",
              "      <td>8</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.742463</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>5</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>when mean hemodynamic profiles were compared i...</td>\n",
              "      <td>5</td>\n",
              "      <td>8</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.899310</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>6</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>multivariable analyses revealed that patients ...</td>\n",
              "      <td>6</td>\n",
              "      <td>8</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.525651</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7</th>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>abnormal lfts are common in the adhf populatio...</td>\n",
              "      <td>7</td>\n",
              "      <td>8</td>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>0.449698</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8</th>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>elevated meld-xi scores are associated with po...</td>\n",
              "      <td>8</td>\n",
              "      <td>8</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.523380</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>9</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>minimally invasive endovascular aneurysm repai...</td>\n",
              "      <td>0</td>\n",
              "      <td>12</td>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>0.544405</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>the aim of this study was to analyse the cost-...</td>\n",
              "      <td>1</td>\n",
              "      <td>12</td>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>0.486076</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>11</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>resource use was determined from the amsterdam...</td>\n",
              "      <td>2</td>\n",
              "      <td>12</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.669382</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>12</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>the analysis was performed from a provider per...</td>\n",
              "      <td>3</td>\n",
              "      <td>12</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.859642</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>all costs were calculated as if all patients h...</td>\n",
              "      <td>4</td>\n",
              "      <td>12</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.542948</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>14</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>a total of @ patients were randomized .</td>\n",
              "      <td>5</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.702793</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>15</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the @-day mortality rate was @ per cent after ...</td>\n",
              "      <td>6</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.596260</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>16</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>at @months , the total mortality rate for evar...</td>\n",
              "      <td>7</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.884152</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>17</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the mean cost difference between evar and or w...</td>\n",
              "      <td>8</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.797754</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>18</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>the incremental cost-effectiveness ratio per p...</td>\n",
              "      <td>9</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.806620</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>19</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>there was no significant difference in quality...</td>\n",
              "      <td>10</td>\n",
              "      <td>12</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.765841</td>\n",
              "      <td>True</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "         target  ... correct\n",
              "0    BACKGROUND  ...   False\n",
              "1       RESULTS  ...   False\n",
              "2       RESULTS  ...   False\n",
              "3       RESULTS  ...   False\n",
              "4       RESULTS  ...    True\n",
              "5       RESULTS  ...    True\n",
              "6       RESULTS  ...    True\n",
              "7   CONCLUSIONS  ...    True\n",
              "8   CONCLUSIONS  ...   False\n",
              "9    BACKGROUND  ...    True\n",
              "10   BACKGROUND  ...   False\n",
              "11      METHODS  ...    True\n",
              "12      METHODS  ...    True\n",
              "13      METHODS  ...    True\n",
              "14      RESULTS  ...    True\n",
              "15      RESULTS  ...    True\n",
              "16      RESULTS  ...    True\n",
              "17      RESULTS  ...    True\n",
              "18      RESULTS  ...    True\n",
              "19      RESULTS  ...    True\n",
              "\n",
              "[20 rows x 7 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 215
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "63aVn69B6sWe"
      },
      "source": [
        "Looking good! Having our data like this, makes it very easy to manipulate and view in different ways.\n",
        "\n",
        "How about we sort our DataFrame to find the samples with the highest `\"pred_prob\"` and where the prediction was wrong (`\"correct\" == False`)?"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 402
        },
        "id": "tDUOsuKJ6IQH",
        "outputId": "6141e272-6137-47c3-c9e2-676972cd748a"
      },
      "source": [
        "# Find top 100 most wrong samples (note: 100 is an abitrary number, you could go through all of them if you wanted)\n",
        "top_100_wrong = test_df[test_df[\"correct\"] == False].sort_values(\"pred_prob\", ascending=False)[:100]\n",
        "top_100_wrong"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>target</th>\n",
              "      <th>text</th>\n",
              "      <th>line_number</th>\n",
              "      <th>total_lines</th>\n",
              "      <th>prediction</th>\n",
              "      <th>pred_prob</th>\n",
              "      <th>correct</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>16347</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>to evaluate the effects of the lactic acid bac...</td>\n",
              "      <td>0</td>\n",
              "      <td>12</td>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>0.938932</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3573</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>a cluster randomised trial was implemented wit...</td>\n",
              "      <td>3</td>\n",
              "      <td>16</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.934583</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>13874</th>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>symptom outcomes will be assessed and estimate...</td>\n",
              "      <td>4</td>\n",
              "      <td>6</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.929205</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>29294</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>baseline measures included sociodemographics ,...</td>\n",
              "      <td>4</td>\n",
              "      <td>13</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.915164</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>835</th>\n",
              "      <td>BACKGROUND</td>\n",
              "      <td>to assess the temporal patterns of late gastro...</td>\n",
              "      <td>0</td>\n",
              "      <td>11</td>\n",
              "      <td>OBJECTIVE</td>\n",
              "      <td>0.907444</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>...</th>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "      <td>...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>7823</th>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>at @ year , mortality rates in the pi and ppci...</td>\n",
              "      <td>8</td>\n",
              "      <td>10</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.825351</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>8638</th>\n",
              "      <td>METHODS</td>\n",
              "      <td>this study is registered with clinicaltrials.g...</td>\n",
              "      <td>5</td>\n",
              "      <td>9</td>\n",
              "      <td>RESULTS</td>\n",
              "      <td>0.824962</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2605</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>circulating epc ( cells positive for cd@ , cd@...</td>\n",
              "      <td>4</td>\n",
              "      <td>10</td>\n",
              "      <td>METHODS</td>\n",
              "      <td>0.824808</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3859</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>this suggests that generalisation of habituati...</td>\n",
              "      <td>9</td>\n",
              "      <td>11</td>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>0.824272</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>10089</th>\n",
              "      <td>RESULTS</td>\n",
              "      <td>qualitative findings in the community trial al...</td>\n",
              "      <td>10</td>\n",
              "      <td>19</td>\n",
              "      <td>CONCLUSIONS</td>\n",
              "      <td>0.823939</td>\n",
              "      <td>False</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>100 rows × 7 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "            target  ... correct\n",
              "16347   BACKGROUND  ...   False\n",
              "3573       RESULTS  ...   False\n",
              "13874  CONCLUSIONS  ...   False\n",
              "29294      RESULTS  ...   False\n",
              "835     BACKGROUND  ...   False\n",
              "...            ...  ...     ...\n",
              "7823   CONCLUSIONS  ...   False\n",
              "8638       METHODS  ...   False\n",
              "2605       RESULTS  ...   False\n",
              "3859       RESULTS  ...   False\n",
              "10089      RESULTS  ...   False\n",
              "\n",
              "[100 rows x 7 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 216
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yKc6UJwZ7dHu"
      },
      "source": [
        "Great (or not so great)! Now we've got a subset of our model's most wrong predictions, let's write some code to visualize them."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ysddyYy717HJ",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "1924007c-b380-40f0-c2ba-9589ad090525"
      },
      "source": [
        "# Investigate top wrong preds\n",
        "for row in top_100_wrong[0:10].itertuples(): # adjust indexes to view different samples\n",
        "  _, target, text, line_number, total_lines, prediction, pred_prob, _ = row\n",
        "  print(f\"Target: {target}, Pred: {prediction}, Prob: {pred_prob}, Line number: {line_number}, Total lines: {total_lines}\\n\")\n",
        "  print(f\"Text:\\n{text}\\n\")\n",
        "  print(\"-----\\n\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Target: BACKGROUND, Pred: OBJECTIVE, Prob: 0.9389324188232422, Line number: 0, Total lines: 12\n",
            "\n",
            "Text:\n",
            "to evaluate the effects of the lactic acid bacterium lactobacillus salivarius on caries risk factors .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: RESULTS, Pred: METHODS, Prob: 0.9345834255218506, Line number: 3, Total lines: 16\n",
            "\n",
            "Text:\n",
            "a cluster randomised trial was implemented with @,@ children in @ government primary schools on the south coast of kenya in @-@ .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: CONCLUSIONS, Pred: METHODS, Prob: 0.9292047619819641, Line number: 4, Total lines: 6\n",
            "\n",
            "Text:\n",
            "symptom outcomes will be assessed and estimates of cost-effectiveness made .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: RESULTS, Pred: METHODS, Prob: 0.9151636958122253, Line number: 4, Total lines: 13\n",
            "\n",
            "Text:\n",
            "baseline measures included sociodemographics , standardized anthropometrics , asthma control test ( act ) , gerd symptom assessment scale , pittsburgh sleep quality index , and berlin questionnaire for sleep apnea .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: BACKGROUND, Pred: OBJECTIVE, Prob: 0.9074438810348511, Line number: 0, Total lines: 11\n",
            "\n",
            "Text:\n",
            "to assess the temporal patterns of late gastrointestinal ( gi ) and genitourinary ( gu ) radiotherapy toxicity and resolution rates in a randomised controlled trial ( all-ireland cooperative oncology research group @-@ ) assessing duration of neo-adjuvant ( na ) hormone therapy for localised prostate cancer .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: RESULTS, Pred: METHODS, Prob: 0.905343234539032, Line number: 3, Total lines: 13\n",
            "\n",
            "Text:\n",
            "data were collected prospectively for @ months beginning after completion of the first @ group clinic appointments ( @ months post randomization ) .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: METHODS, Pred: OBJECTIVE, Prob: 0.9049059748649597, Line number: 0, Total lines: 7\n",
            "\n",
            "Text:\n",
            "to determine whether the insulin resistance that exists in metabolic syndrome ( mets ) patients is modulated by dietary fat composition .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: BACKGROUND, Pred: OBJECTIVE, Prob: 0.90473872423172, Line number: 0, Total lines: 9\n",
            "\n",
            "Text:\n",
            "to compare the efficacy of the newcastle infant dialysis and ultrafiltration system ( nidus ) with peritoneal dialysis ( pd ) and conventional haemodialysis ( hd ) in infants weighing < @ kg .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: METHODS, Pred: RESULTS, Prob: 0.9013399481773376, Line number: 5, Total lines: 7\n",
            "\n",
            "Text:\n",
            "at this time , an as@ response was achieved by @ ( @ % ) and @ ( @ % ) patients in groups @ and @ , respectively ( p < @ for all ) .\n",
            "\n",
            "-----\n",
            "\n",
            "Target: CONCLUSIONS, Pred: RESULTS, Prob: 0.8980520367622375, Line number: 7, Total lines: 9\n",
            "\n",
            "Text:\n",
            "pdt was associated with a significant decrease in bleeding scores ( p = @ ) as well as inflammatory exudation ( p = @ ) .\n",
            "\n",
            "-----\n",
            "\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wMbQeX-M7sYV"
      },
      "source": [
        "What do you notice about the most wrong predictions? Does the model make silly mistakes? Or are some of the labels incorrect/ambiguous (e.g. a line in an abstract could potentially be labelled `OBJECTIVE` or `BACKGROUND` and make sense).\n",
        "\n",
        "A next step here would be if there are a fair few samples with inconsistent labels, you could go through your training dataset, update the labels and then retrain a model. The process of using a model to help improve/investigate your dataset's labels is often referred to as **active learning**."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Pfz_b-Tmapdz"
      },
      "source": [
        "## Make example predictions\n",
        "\n",
        "Okay, we've made some predictions on the test dataset, now's time to really test our model out.\n",
        "\n",
        "To do so, we're going to get some data from the wild and see how our model performs.\n",
        "\n",
        "In other words, were going to find an RCT abstract from PubMed, preprocess the text so it works with our model, then pass each sequence in the wild abstract through our model to see what label it predicts.\n",
        "\n",
        "For an appropriate sample, we'll need to search PubMed for RCT's (randomized controlled trials) without abstracts which have been split up (on exploring PubMed you'll notice many of the abstracts are already preformatted into separate sections, this helps dramatically with readability).\n",
        "\n",
        "Going through various PubMed studies, I managed to find the following unstructured abstract from [*RCT of a manualized social treatment for high-functioning autism spectrum disorders*](https://pubmed.ncbi.nlm.nih.gov/20232240/):\n",
        "\n",
        "> This RCT examined the efficacy of a manualized social intervention for children with HFASDs. Participants were randomly assigned to treatment or wait-list conditions. Treatment included instruction and therapeutic activities targeting social skills, face-emotion recognition, interest expansion, and interpretation of non-literal language. A response-cost program was applied to reduce problem behaviors and foster skills acquisition. Significant treatment effects were found for five of seven primary outcome measures (parent ratings and direct child measures). Secondary measures based on staff ratings (treatment group only) corroborated gains reported by parents. High levels of parent, child and staff satisfaction were reported, along with high levels of treatment fidelity. Standardized effect size estimates were primarily in the medium and large ranges and favored the treatment group.\n",
        "\n",
        "Looking at the large chunk of text can seem quite intimidating. Now imagine you're a medical researcher trying to skim through the literature to find a study relevant to your work.\n",
        "\n",
        "Sounds like quite the challenge right?\n",
        "\n",
        "Enter SkimLit 🤓🔥!\n",
        "\n",
        "Let's see what our best model so far (`model_5`) makes of the above abstract.\n",
        "\n",
        "But wait...\n",
        "\n",
        "As you might've guessed the above abstract hasn't been formatted in the same structure as the data our model has been trained on. Therefore, before we can make a prediction on it, we need to preprocess it just as we have our other sequences.\n",
        "\n",
        "More specifically, for each abstract, we'll need to:\n",
        "\n",
        "1. Split it into sentences (lines).\n",
        "2. Split it into characters.\n",
        "3. Find the number of each line.\n",
        "4. Find the total number of lines.\n",
        "\n",
        "Starting with number 1, there are a couple of ways to split our abstracts into actual sentences. A simple one would be to use Python's in-built `split()` string method, splitting the abstract wherever a fullstop appears. However, can you imagine where this might go wrong? \n",
        "\n",
        "Another more advanced option would be to leverage [spaCy's](https://spacy.io/) (a very powerful NLP library) [`sentencizer`](https://spacy.io/usage/linguistic-features#sbd) class. Which is an easy to use sentence splitter based on spaCy's English language model.\n",
        "\n",
        "I've prepared some abstracts from PubMed RCT papers to try our model on, we can download them [from GitHub](https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/skimlit_example_abstracts.json).\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "-qKFXysU9Y1j",
        "outputId": "526894d9-9f36-450e-b20f-6fca58684b3a"
      },
      "source": [
        "# Download and open example abstracts (copy and pasted from PubMed)\n",
        "!wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/skimlit_example_abstracts.json\n",
        "\n",
        "with open(\"skimlit_example_abstracts.json\", \"r\") as f:\n",
        "  example_abstracts = json.load(f)\n",
        "\n",
        "example_abstracts"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "--2021-04-02 05:04:14--  https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/skimlit_example_abstracts.json\n",
            "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n",
            "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n",
            "HTTP request sent, awaiting response... 200 OK\n",
            "Length: 6737 (6.6K) [text/plain]\n",
            "Saving to: ‘skimlit_example_abstracts.json’\n",
            "\n",
            "\r          skimlit_e   0%[                    ]       0  --.-KB/s               \rskimlit_example_abs 100%[===================>]   6.58K  --.-KB/s    in 0s      \n",
            "\n",
            "2021-04-02 05:04:14 (97.3 MB/s) - ‘skimlit_example_abstracts.json’ saved [6737/6737]\n",
            "\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[{'abstract': 'This RCT examined the efficacy of a manualized social intervention for children with HFASDs. Participants were randomly assigned to treatment or wait-list conditions. Treatment included instruction and therapeutic activities targeting social skills, face-emotion recognition, interest expansion, and interpretation of non-literal language. A response-cost program was applied to reduce problem behaviors and foster skills acquisition. Significant treatment effects were found for five of seven primary outcome measures (parent ratings and direct child measures). Secondary measures based on staff ratings (treatment group only) corroborated gains reported by parents. High levels of parent, child and staff satisfaction were reported, along with high levels of treatment fidelity. Standardized effect size estimates were primarily in the medium and large ranges and favored the treatment group.',\n",
              "  'details': 'RCT of a manualized social treatment for high-functioning autism spectrum disorders',\n",
              "  'source': 'https://pubmed.ncbi.nlm.nih.gov/20232240/'},\n",
              " {'abstract': \"Postpartum depression (PPD) is the most prevalent mood disorder associated with childbirth. No single cause of PPD has been identified, however the increased risk of nutritional deficiencies incurred through the high nutritional requirements of pregnancy may play a role in the pathology of depressive symptoms. Three nutritional interventions have drawn particular interest as possible non-invasive and cost-effective prevention and/or treatment strategies for PPD; omega-3 (n-3) long chain polyunsaturated fatty acids (LCPUFA), vitamin D and overall diet. We searched for meta-analyses of randomised controlled trials (RCT's) of nutritional interventions during the perinatal period with PPD as an outcome, and checked for any trials published subsequently to the meta-analyses. Fish oil: Eleven RCT's of prenatal fish oil supplementation RCT's show null and positive effects on PPD symptoms. Vitamin D: no relevant RCT's were identified, however seven observational studies of maternal vitamin D levels with PPD outcomes showed inconsistent associations. Diet: Two Australian RCT's with dietary advice interventions in pregnancy had a positive and null result on PPD. With the exception of fish oil, few RCT's with nutritional interventions during pregnancy assess PPD. Further research is needed to determine whether nutritional intervention strategies during pregnancy can protect against symptoms of PPD. Given the prevalence of PPD and ease of administering PPD measures, we recommend future prenatal nutritional RCT's include PPD as an outcome.\",\n",
              "  'details': 'Formatting removed (can be used to compare model to actual example)',\n",
              "  'source': 'https://pubmed.ncbi.nlm.nih.gov/28012571/'},\n",
              " {'abstract': 'Mental illness, including depression, anxiety and bipolar disorder, accounts for a significant proportion of global disability and poses a substantial social, economic and heath burden. Treatment is presently dominated by pharmacotherapy, such as antidepressants, and psychotherapy, such as cognitive behavioural therapy; however, such treatments avert less than half of the disease burden, suggesting that additional strategies are needed to prevent and treat mental disorders. There are now consistent mechanistic, observational and interventional data to suggest diet quality may be a modifiable risk factor for mental illness. This review provides an overview of the nutritional psychiatry field. It includes a discussion of the neurobiological mechanisms likely modulated by diet, the use of dietary and nutraceutical interventions in mental disorders, and recommendations for further research. Potential biological pathways related to mental disorders include inflammation, oxidative stress, the gut microbiome, epigenetic modifications and neuroplasticity. Consistent epidemiological evidence, particularly for depression, suggests an association between measures of diet quality and mental health, across multiple populations and age groups; these do not appear to be explained by other demographic, lifestyle factors or reverse causality. Our recently published intervention trial provides preliminary clinical evidence that dietary interventions in clinically diagnosed populations are feasible and can provide significant clinical benefit. Furthermore, nutraceuticals including n-3 fatty acids, folate, S-adenosylmethionine, N-acetyl cysteine and probiotics, among others, are promising avenues for future research. Continued research is now required to investigate the efficacy of intervention studies in large cohorts and within clinically relevant populations, particularly in patients with schizophrenia, bipolar and anxiety disorders.',\n",
              "  'details': 'Effect of nutrition on mental health',\n",
              "  'source': 'https://pubmed.ncbi.nlm.nih.gov/28942748/'},\n",
              " {'abstract': \"Hepatitis C virus (HCV) and alcoholic liver disease (ALD), either alone or in combination, count for more than two thirds of all liver diseases in the Western world. There is no safe level of drinking in HCV-infected patients and the most effective goal for these patients is total abstinence. Baclofen, a GABA(B) receptor agonist, represents a promising pharmacotherapy for alcohol dependence (AD). Previously, we performed a randomized clinical trial (RCT), which demonstrated the safety and efficacy of baclofen in patients affected by AD and cirrhosis. The goal of this post-hoc analysis was to explore baclofen's effect in a subgroup of alcohol-dependent HCV-infected cirrhotic patients. Any patient with HCV infection was selected for this analysis. Among the 84 subjects randomized in the main trial, 24 alcohol-dependent cirrhotic patients had a HCV infection; 12 received baclofen 10mg t.i.d. and 12 received placebo for 12-weeks. With respect to the placebo group (3/12, 25.0%), a significantly higher number of patients who achieved and maintained total alcohol abstinence was found in the baclofen group (10/12, 83.3%; p=0.0123). Furthermore, in the baclofen group, compared to placebo, there was a significantly higher increase in albumin values from baseline (p=0.0132) and a trend toward a significant reduction in INR levels from baseline (p=0.0716). In conclusion, baclofen was safe and significantly more effective than placebo in promoting alcohol abstinence, and improving some Liver Function Tests (LFTs) (i.e. albumin, INR) in alcohol-dependent HCV-infected cirrhotic patients. Baclofen may represent a clinically relevant alcohol pharmacotherapy for these patients.\",\n",
              "  'details': 'Baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis C virus (HCV) infection',\n",
              "  'source': 'https://pubmed.ncbi.nlm.nih.gov/22244707/'}]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 249
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 283
        },
        "id": "-1cIAS1Z6r_l",
        "outputId": "b08d8b74-0187-469f-99ad-d9f77145f4fd"
      },
      "source": [
        "# See what our example abstracts look like\n",
        "abstracts = pd.DataFrame(example_abstracts)\n",
        "abstracts"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>abstract</th>\n",
              "      <th>source</th>\n",
              "      <th>details</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>This RCT examined the efficacy of a manualized...</td>\n",
              "      <td>https://pubmed.ncbi.nlm.nih.gov/20232240/</td>\n",
              "      <td>RCT of a manualized social treatment for high-...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Postpartum depression (PPD) is the most preval...</td>\n",
              "      <td>https://pubmed.ncbi.nlm.nih.gov/28012571/</td>\n",
              "      <td>Formatting removed (can be used to compare mod...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Mental illness, including depression, anxiety ...</td>\n",
              "      <td>https://pubmed.ncbi.nlm.nih.gov/28942748/</td>\n",
              "      <td>Effect of nutrition on mental health</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>Hepatitis C virus (HCV) and alcoholic liver di...</td>\n",
              "      <td>https://pubmed.ncbi.nlm.nih.gov/22244707/</td>\n",
              "      <td>Baclofen promotes alcohol abstinence in alcoho...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "                                            abstract  ...                                            details\n",
              "0  This RCT examined the efficacy of a manualized...  ...  RCT of a manualized social treatment for high-...\n",
              "1  Postpartum depression (PPD) is the most preval...  ...  Formatting removed (can be used to compare mod...\n",
              "2  Mental illness, including depression, anxiety ...  ...               Effect of nutrition on mental health\n",
              "3  Hepatitis C virus (HCV) and alcoholic liver di...  ...  Baclofen promotes alcohol abstinence in alcoho...\n",
              "\n",
              "[4 rows x 3 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 250
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CnZWtDki9uxc"
      },
      "source": [
        "Now that we've downloaded some example abstracts, let's see how our trained model handles one of them.\n",
        "\n",
        "First, we'll need to parse it using spaCy to turn it from a big chunk of text into sentences."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_gwVNdLQHpQX",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "9ed1f3cf-333f-47e5-8030-ecf4b94a820e"
      },
      "source": [
        "# Create sentencizer - Source: https://spacy.io/usage/linguistic-features#sbd\n",
        "from spacy.lang.en import English\n",
        "nlp = English() # setup English sentence parser\n",
        "# spaCy v3+: add the built-in sentence splitting component to the pipeline by name\n",
        "nlp.add_pipe(\"sentencizer\")\n",
        "# spaCy v2 equivalent: nlp.add_pipe(nlp.create_pipe(\"sentencizer\"))\n",
        "doc = nlp(example_abstracts[0][\"abstract\"]) # create \"doc\" of parsed sequences, change index for a different abstract\n",
        "abstract_lines = [str(sent) for sent in list(doc.sents)] # return detected sentences from doc in string type (not spaCy token type)\n",
        "abstract_lines"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['This RCT examined the efficacy of a manualized social intervention for children with HFASDs.',\n",
              " 'Participants were randomly assigned to treatment or wait-list conditions.',\n",
              " 'Treatment included instruction and therapeutic activities targeting social skills, face-emotion recognition, interest expansion, and interpretation of non-literal language.',\n",
              " 'A response-cost program was applied to reduce problem behaviors and foster skills acquisition.',\n",
              " 'Significant treatment effects were found for five of seven primary outcome measures (parent ratings and direct child measures).',\n",
              " 'Secondary measures based on staff ratings (treatment group only) corroborated gains reported by parents.',\n",
              " 'High levels of parent, child and staff satisfaction were reported, along with high levels of treatment fidelity.',\n",
              " 'Standardized effect size estimates were primarily in the medium and large ranges and favored the treatment group.']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 251
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UiCg-H4G16Gx"
      },
      "source": [
        "Beautiful! It looks like spaCy has split the sentences in the abstract correctly. Note, however, that more complex abstracts may not split perfectly into separate sentences (such as the example in [*Baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis C virus (HCV) infection*](https://pubmed.ncbi.nlm.nih.gov/22244707/)). In those cases, more custom splitting techniques would have to be investigated.\n",
        "\n",
        "Now that our abstract has been split into sentences, let's write some code to record each line's number as well as the total number of lines.\n",
        "\n",
        "To do so, we can leverage some of the functionality of our `preprocess_text_with_line_numbers()` function."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "A_Hi0alJI4Xu",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "db99cde3-b064-4e27-a8ef-7a7157d5b247"
      },
      "source": [
        "# Get total number of lines\n",
        "total_lines_in_sample = len(abstract_lines)\n",
        "\n",
        "# Go through each line in abstract and create a list of dictionaries containing features for each line\n",
        "sample_lines = []\n",
        "for i, line in enumerate(abstract_lines):\n",
        "  sample_dict = {}\n",
        "  sample_dict[\"text\"] = str(line)\n",
        "  sample_dict[\"line_number\"] = i\n",
        "  sample_dict[\"total_lines\"] = total_lines_in_sample - 1\n",
        "  sample_lines.append(sample_dict)\n",
        "sample_lines"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "[{'line_number': 0,\n",
              "  'text': 'This RCT examined the efficacy of a manualized social intervention for children with HFASDs.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 1,\n",
              "  'text': 'Participants were randomly assigned to treatment or wait-list conditions.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 2,\n",
              "  'text': 'Treatment included instruction and therapeutic activities targeting social skills, face-emotion recognition, interest expansion, and interpretation of non-literal language.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 3,\n",
              "  'text': 'A response-cost program was applied to reduce problem behaviors and foster skills acquisition.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 4,\n",
              "  'text': 'Significant treatment effects were found for five of seven primary outcome measures (parent ratings and direct child measures).',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 5,\n",
              "  'text': 'Secondary measures based on staff ratings (treatment group only) corroborated gains reported by parents.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 6,\n",
              "  'text': 'High levels of parent, child and staff satisfaction were reported, along with high levels of treatment fidelity.',\n",
              "  'total_lines': 7},\n",
              " {'line_number': 7,\n",
              "  'text': 'Standardized effect size estimates were primarily in the medium and large ranges and favored the treatment group.',\n",
              "  'total_lines': 7}]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 252
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "17X7ez2r37Nw"
      },
      "source": [
        "Now we've got `\"line_number\"` and `\"total_lines\"` values, we can one-hot encode them with `tf.one_hot` just like we did with our training dataset (using the same values for the `depth` parameter)."
      ]
    },
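    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see why the `depth` value matters, here's a minimal pure-Python sketch of how one-hot encoding treats an index. The helper name `one_hot_sketch` is hypothetical (it's not part of TensorFlow) and only mimics the `tf.one_hot` behaviour of returning an all-zero row for an index outside `0` to `depth - 1`.\n",
        "\n",
        "```python\n",
        "def one_hot_sketch(indices, depth):\n",
        "    # Hypothetical helper: build one row per index, each row has `depth` slots\n",
        "    rows = []\n",
        "    for index in indices:\n",
        "        row = [0.0] * depth\n",
        "        # Only indices inside [0, depth) switch a slot on, anything else stays all zeros\n",
        "        if 0 <= index < depth:\n",
        "            row[index] = 1.0\n",
        "        rows.append(row)\n",
        "    return rows\n",
        "\n",
        "# Line 7 fits inside depth=15, a made-up line 20 does not\n",
        "encoded = one_hot_sketch([7, 20], depth=15)\n",
        "print(sum(encoded[0]), sum(encoded[1]))\n",
        "```\n",
        "\n",
        "In other words, an abstract with more lines than `depth` should still produce the input shape the model expects, but its later lines would carry no position information."
      ]
    },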
    {
      "cell_type": "code",
      "metadata": {
        "id": "rm0MYaAnBkbp",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "e088e47a-6528-443d-8563-7e0b329fdd49"
      },
      "source": [
        "# Get all line_number values from sample abstract\n",
        "test_abstract_line_numbers = [line[\"line_number\"] for line in sample_lines]\n",
        "# One-hot encode to same depth as training data, so model accepts right input shape\n",
        "test_abstract_line_numbers_one_hot = tf.one_hot(test_abstract_line_numbers, depth=15) \n",
        "test_abstract_line_numbers_one_hot"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(8, 15), dtype=float32, numpy=\n",
              "array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]],\n",
              "      dtype=float32)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 254
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8Wzbv3w6B3OU",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "693528f5-fe5a-4705-bd3a-3a4445f1134a"
      },
      "source": [
        "# Get all total_lines values from sample abstract\n",
        "test_abstract_total_lines = [line[\"total_lines\"] for line in sample_lines]\n",
        "# One-hot encode to same depth as training data, so model accepts right input shape\n",
        "test_abstract_total_lines_one_hot = tf.one_hot(test_abstract_total_lines, depth=20)\n",
        "test_abstract_total_lines_one_hot"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(8, 20), dtype=float32, numpy=\n",
              "array([[0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.],\n",
              "       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0.,\n",
              "        0., 0., 0., 0.]], dtype=float32)>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 255
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Wq-f17G440ur"
      },
      "source": [
        "We can also use our `split_chars()` function to split our abstract lines into characters."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HOOPoG3cCA0F",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "0a1a31f0-4b4c-4d5c-f735-45becd2d11cc"
      },
      "source": [
        "# Split abstract lines into characters\n",
        "abstract_chars = [split_chars(sentence) for sentence in abstract_lines]\n",
        "abstract_chars"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['T h i s   R C T   e x a m i n e d   t h e   e f f i c a c y   o f   a   m a n u a l i z e d   s o c i a l   i n t e r v e n t i o n   f o r   c h i l d r e n   w i t h   H F A S D s .',\n",
              " 'P a r t i c i p a n t s   w e r e   r a n d o m l y   a s s i g n e d   t o   t r e a t m e n t   o r   w a i t - l i s t   c o n d i t i o n s .',\n",
              " 'T r e a t m e n t   i n c l u d e d   i n s t r u c t i o n   a n d   t h e r a p e u t i c   a c t i v i t i e s   t a r g e t i n g   s o c i a l   s k i l l s ,   f a c e - e m o t i o n   r e c o g n i t i o n ,   i n t e r e s t   e x p a n s i o n ,   a n d   i n t e r p r e t a t i o n   o f   n o n - l i t e r a l   l a n g u a g e .',\n",
              " 'A   r e s p o n s e - c o s t   p r o g r a m   w a s   a p p l i e d   t o   r e d u c e   p r o b l e m   b e h a v i o r s   a n d   f o s t e r   s k i l l s   a c q u i s i t i o n .',\n",
              " 'S i g n i f i c a n t   t r e a t m e n t   e f f e c t s   w e r e   f o u n d   f o r   f i v e   o f   s e v e n   p r i m a r y   o u t c o m e   m e a s u r e s   ( p a r e n t   r a t i n g s   a n d   d i r e c t   c h i l d   m e a s u r e s ) .',\n",
              " 'S e c o n d a r y   m e a s u r e s   b a s e d   o n   s t a f f   r a t i n g s   ( t r e a t m e n t   g r o u p   o n l y )   c o r r o b o r a t e d   g a i n s   r e p o r t e d   b y   p a r e n t s .',\n",
              " 'H i g h   l e v e l s   o f   p a r e n t ,   c h i l d   a n d   s t a f f   s a t i s f a c t i o n   w e r e   r e p o r t e d ,   a l o n g   w i t h   h i g h   l e v e l s   o f   t r e a t m e n t   f i d e l i t y .',\n",
              " 'S t a n d a r d i z e d   e f f e c t   s i z e   e s t i m a t e s   w e r e   p r i m a r i l y   i n   t h e   m e d i u m   a n d   l a r g e   r a n g e s   a n d   f a v o r e d   t h e   t r e a t m e n t   g r o u p .']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 256
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5MO7_-Hx5FvS"
      },
      "source": [
        "Alright, now we've preprocessed our wild RCT abstract into all of the same features our model was trained on, we can pass these features to our model and make sequence label predictions!"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "0b7siZa1CQG7",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "5a5d7896-c83f-4288-d5c4-95d28af30772"
      },
      "source": [
        "%%time\n",
        "# Make predictions on sample abstract features (%%time is a cell magic, so it has to be the first line of the cell)\n",
        "test_abstract_pred_probs = loaded_model.predict(x=(test_abstract_line_numbers_one_hot,\n",
        "                                                   test_abstract_total_lines_one_hot,\n",
        "                                                   tf.constant(abstract_lines),\n",
        "                                                   tf.constant(abstract_chars)))\n",
        "test_abstract_pred_probs"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "CPU times: user 75.2 ms, sys: 8.94 ms, total: 84.1 ms\n",
            "Wall time: 73.9 ms\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "8nxqfCBfCqWe",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "e0e7b6a2-329c-4b77-af04-63742a758309"
      },
      "source": [
        "# Turn prediction probabilities into prediction classes\n",
        "test_abstract_preds = tf.argmax(test_abstract_pred_probs, axis=1)\n",
        "test_abstract_preds"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "<tf.Tensor: shape=(8,), dtype=int64, numpy=array([3, 2, 2, 2, 4, 2, 4, 4])>"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 258
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "tSOOV4bp5sZI"
      },
      "source": [
        "Now we've got the predicted sequence label for each line in our sample abstract, let's write some code to visualize each sentence with its predicted label."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "LduhApa3C1mD",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "7bac11a4-4a57-46b7-e1d0-36455dfff602"
      },
      "source": [
        "# Turn prediction class integers into string class names\n",
        "test_abstract_pred_classes = [label_encoder.classes_[i] for i in test_abstract_preds]\n",
        "test_abstract_pred_classes"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "['OBJECTIVE',\n",
              " 'METHODS',\n",
              " 'METHODS',\n",
              " 'METHODS',\n",
              " 'RESULTS',\n",
              " 'METHODS',\n",
              " 'RESULTS',\n",
              " 'RESULTS']"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 259
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "BhhDPZSHDCJD",
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "outputId": "85db53a7-9ffc-4e54-b5cb-2d1bb69e37fe"
      },
      "source": [
        "# Visualize abstract lines and predicted sequence labels\n",
        "for i, line in enumerate(abstract_lines):\n",
        "  print(f\"{test_abstract_pred_classes[i]}: {line}\")"
      ],
      "execution_count": null,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "OBJECTIVE: This RCT examined the efficacy of a manualized social intervention for children with HFASDs.\n",
            "METHODS: Participants were randomly assigned to treatment or wait-list conditions.\n",
            "METHODS: Treatment included instruction and therapeutic activities targeting social skills, face-emotion recognition, interest expansion, and interpretation of non-literal language.\n",
            "METHODS: A response-cost program was applied to reduce problem behaviors and foster skills acquisition.\n",
            "RESULTS: Significant treatment effects were found for five of seven primary outcome measures (parent ratings and direct child measures).\n",
            "METHODS: Secondary measures based on staff ratings (treatment group only) corroborated gains reported by parents.\n",
            "RESULTS: High levels of parent, child and staff satisfaction were reported, along with high levels of treatment fidelity.\n",
            "RESULTS: Standardized effect size estimates were primarily in the medium and large ranges and favored the treatment group.\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vCQVQ5DAKz4M"
      },
      "source": [
        "Nice! Isn't that much easier to read? I mean, it looks like our model's predictions could be improved, but how cool is that?\n",
        "\n",
        "Imagine integrating our model into the backend of the PubMed website to format any unstructured RCT abstract on the site.\n",
        "\n",
        "Or there could even be a browser extension called \"SkimLit\" which would add structure (powered by our model) to any unstructured RCT abstract.\n",
        "\n",
        "And if you showed your medical researcher friend and they thought the predictions weren't up to standard, there could be a button saying \"Is this label correct? If not, what should it be?\". That way the dataset, along with our model's future predictions, could be improved over time.\n",
        "\n",
        "Of course, there are many more ways we could improve the model, the usability and the preprocessing functionality (e.g. functionizing our sample abstract preprocessing pipeline), but I'll leave these for the exercises/extensions.\n",
        "\n",
        "> 🤔 **Question:** How can we be sure the results of our test example from the wild are truly *wild*? Is there something we should check about the sample we're testing on?"
      ]
    },
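    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One possible way to start on the question above is to check whether any sentence from the sample abstract already appears in the training data. The sketch below uses only the standard library. The names `normalize` and `find_overlap` and the toy sentence lists are hypothetical stand-ins (in the notebook you'd pass your own list of training sentences along with `abstract_lines`).\n",
        "\n",
        "```python\n",
        "def normalize(sentence):\n",
        "    # Lowercase and collapse whitespace so small formatting differences don't hide a match\n",
        "    return ' '.join(sentence.lower().split())\n",
        "\n",
        "def find_overlap(sample_sentences, training_sentences):\n",
        "    # Return the sample sentences that also occur in the training sentences\n",
        "    seen = {normalize(s) for s in training_sentences}\n",
        "    return [s for s in sample_sentences if normalize(s) in seen]\n",
        "\n",
        "# Toy data for illustration only (not taken from the real dataset)\n",
        "toy_training = ['Participants were randomly assigned to two groups .']\n",
        "toy_sample = ['Participants were randomly  assigned to two groups .', 'A new sentence.']\n",
        "overlap = find_overlap(toy_sample, toy_training)\n",
        "print(overlap)\n",
        "```\n",
        "\n",
        "Exact matching is a blunt tool. If the training text was preprocessed differently from the sample, a fuzzier comparison (or comparing the source of each abstract) may be needed."
      ]
    },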
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bLNyY9OoLEYL"
      },
      "source": [
        "## 🛠 Exercises\n",
        "\n",
        "1. Train `model_5` on all of the data in the training dataset for as many epochs as it takes until it stops improving. Since this might take a while, you might want to use:\n",
        "  * [`tf.keras.callbacks.ModelCheckpoint`](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/ModelCheckpoint) to save the model's best weights only.\n",
        "  * [`tf.keras.callbacks.EarlyStopping`](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping) to stop the model from training once the validation loss has stopped improving for ~3 epochs.\n",
        "2. Check out the [Keras guide on using pretrained GloVe embeddings](https://keras.io/examples/nlp/pretrained_word_embeddings/). Can you get this working with one of our models?\n",
        "  * Hint: You'll want to incorporate it with a custom token [Embedding](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding) layer.\n",
        "  * It's up to you whether or not you fine-tune the GloVe embeddings or leave them frozen.\n",
        "3. Try replacing the TensorFlow Hub Universal Sentence Encoder pretrained embedding with the [TensorFlow Hub BERT PubMed expert](https://tfhub.dev/google/experts/bert/pubmed/2) (a language model pretrained on PubMed texts) pretrained embedding. Does this affect results?\n",
        "  * Note: Using the BERT PubMed expert pretrained embedding requires an extra preprocessing step for sequences (as detailed in the [TensorFlow Hub guide](https://tfhub.dev/google/experts/bert/pubmed/2)).\n",
        "  * Does the BERT model beat the results mentioned in this paper? https://arxiv.org/pdf/1710.06071.pdf\n",
        "4. What happens if you merge our `line_number` and `total_lines` features for each sequence? For example, create an `X_of_Y` feature instead. Does this affect model performance?\n",
        "  * Another example: `line_number=1` and `total_lines=11` turns into `line_of_X=1_of_11`.\n",
        "5. Write a function (or series of functions) to take a sample abstract string, preprocess it (in the same way our model has been trained), make a prediction on each sequence in the abstract and return the abstract in the format:\n",
        "  * `PREDICTED_LABEL`: `SEQUENCE`\n",
        "  * `PREDICTED_LABEL`: `SEQUENCE`\n",
        "  * `PREDICTED_LABEL`: `SEQUENCE`\n",
        "  * `PREDICTED_LABEL`: `SEQUENCE`\n",
        "  * ...\n",
        "    * You can find your own unstructured RCT abstract from PubMed or try this one from: [*Baclofen promotes alcohol abstinence in alcohol dependent cirrhotic patients with hepatitis C virus (HCV) infection*](https://pubmed.ncbi.nlm.nih.gov/22244707/)."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "O6E8rcjKrLzY"
      },
      "source": [
        "## 📖 Extra-curriculum\n",
        "* For more on working with text/spaCy, see [spaCy's advanced NLP course](https://course.spacy.io/en/). If you're going to be working on production-level NLP problems, you'll probably end up using spaCy.\n",
        "* For another look at how to approach a text classification problem like the one we've just gone through, I'd suggest going through [Google's Machine Learning Course for text classification](https://developers.google.com/machine-learning/guides/text-classification).\n",
        "* Our dataset has imbalanced classes (as with many real-world datasets), so it might be worth looking into the [TensorFlow guide on different methods for training a model with imbalanced classes](https://www.tensorflow.org/tutorials/structured_data/imbalanced_data).\n"
      ]
    }
  ]
}